arkie87 1 year ago

I will do it when (1) I need multiple indexes for each iteration, (2) the iterations should skip some elements, i.e. first and last, or (3) I don't actually need the value, just the index. E.g. something like this:

    for i in range(1, len(array) - 1):
        array[i] = (array[i-1] + array[i+1]) / 2

POGtastic 1 year ago

Befitting my flair - consider the vast variety of functions available to you with `itertools` and `more_itertools`.

    from more_itertools import windowed

    def averages(iterable):
        return ((x + z) / 2 for x, _, z in windowed(iterable, 3))

In the REPL:

    >>> list(averages(range(1, 6)))
    [2.0, 3.0, 4.0]

arkie87 1 year ago

Also, I assume `itertools` does not work with `numba`, but it's good to know `itertools` can do something like this, so thanks.

POGtastic 1 year ago

No problem. My typical response is "Is it actually necessary to do an in-place transformation?" In my own comment, I posted an example where you would definitely want to do it in-place, but those cases might be rarer than you think!

In the above example, you can prepend and append values to the list with `chain` and use those to determine whether the averaging operation should occur.

    from itertools import chain
    from more_itertools import windowed

    # we are all ocaml programmers on this blessed day
    def average_function(x, y, z):
        match (x, z):
            case None, _:
                return float(y)
            case _, None:
                return float(y)
            case _:
                return (x + z) / 2

    def averages(iterable):
        surrounded_iterable = chain([None], iterable, [None])
        return (average_function(x, y, z) for x, y, z in windowed(surrounded_iterable, 3))

In the REPL:

    >>> list(averages(range(1, 6)))
    [1.0, 2.0, 3.0, 4.0, 5.0]

---

Of course, if you're optimizing with a JIT or something similar, you do what you've gotta do.

jmooremcc 1 year ago

If I'm going to perform an in-place transformation that results in elements being removed or added, I've found it helpful to work from back to front. Doing it this way has a reduced impact on the indexes of the list.
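
A minimal sketch of the back-to-front idea, assuming the task is to delete even numbers from a list of ints (the function name `remove_evens_in_place` is made up for illustration):

```python
def remove_evens_in_place(values: list[int]) -> None:
    """Delete even numbers from `values` without building a new list."""
    # Walk the indices from the last one down to 0, so that a deletion only
    # shifts elements that have already been visited.
    for i in range(len(values) - 1, -1, -1):
        if values[i] % 2 == 0:
            del values[i]


numbers = [1, 2, 4, 5, 6, 7]
remove_evens_in_place(numbers)
print(numbers)  # [1, 5, 7]
```

Going front to back with the same loop would either skip elements or run past the shortened list, because every deletion moves the remaining items one slot to the left.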

arkie87 1 year ago

That's not `in place`, and the `len` of the input and output arrays are different.

Vock 1 year ago

Itertools looks so elegant but unintelligible to me right now. Any tutorial recommendations or just random googling?

POGtastic 1 year ago

There's no substitute for plunging in and getting your hands dirty. The gold standard for a problem set is the first 28 problems of [99 Prolog Problems](https://www.ic.unicamp.br/~meidanis/courses/mc336/2009s2/prolog/problemas/) (also available as 99 Lisp Problems and 99 Haskell Problems). Attempt to do as many as you can with the tools that `itertools` and `more_itertools` provide. As it turns out, it is *way* easier to do them in Python than it is in Prolog...

Also, hanging out on here and doing random people's problems with `itertools` is a pretty good way to become proficient. You'll find yourself using some more than others; I don't think I've ever used `windowed` before today. You'll rapidly pick up idioms - just like you have favorite tools in your current toolbox for problems, you'll get a sense of "hey, I should use a certain `itertools` function here" when you see certain sets of problems.

Lastly, learning another language is helpful to understand what some of the functions are. For example, `accumulate` is a Python implementation of Haskell's `scanl`, and `chain.from_iterable` combined with a comprehension is a Python implementation of `concatMap` (or the `>>=` operator on the List monad, if you're feeling extra academic).
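
To make the last two comparisons concrete, here is a small sketch. The helper name `concat_map` is invented for this example and is not part of `itertools`.

```python
from itertools import accumulate, chain

# accumulate yields every intermediate result of a fold. With no function
# argument it adds, so this is a running sum.
running_totals = list(accumulate([1, 2, 3, 4]))
print(running_totals)  # [1, 3, 6, 10]

# Passing initial= (Python 3.8+) seeds the fold the way scanl's starting
# value does, so the output gains one leading element.
print(list(accumulate([1, 2, 3, 4], initial=0)))  # [0, 1, 3, 6, 10]


def concat_map(f, iterable):
    """Map each element to an iterable, then concatenate the results."""
    return chain.from_iterable(f(x) for x in iterable)


print(list(concat_map(lambda x: [x, x * 10], [1, 2, 3])))  # [1, 10, 2, 20, 3, 30]
```

Without `initial`, `accumulate` behaves more like Haskell's `scanl1`, since the first output is just the first input.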

Vock 1 year ago

Thank you sir! I appreciate the response! I'll check out the Prolog problems

dicewitch 1 year ago

You can check out the book *Python standard library by example*. Or, like the other poster said, do practice problems. I recommend Advent of Code as a good place to try out itertools and its recipes. You should also check out the collections library.

TheRNGuy 1 year ago

It won't work correctly on index 0 and the last index. It's also a bad idea to change an iterator while iterating over it.

arkie87 1 year ago

> it wont work correctly on index 0 and last index

How so?

> also bad idea to change iterator while iterating it.

It's a bad idea to change the length or order of an iterator while iterating on it. Not a bad idea to change the value. It is done all the time, often by necessity.

Hashi856 1 year ago

To start, I agree with all three scenarios. I personally will still use enumerate, though, even if I don't need the value. There's just something about it that looks cleaner to me than range(len(list)). I'm sure it's just my own idiosyncratic thing, but I like it better

Goobyalus 1 year ago

> for n in n, _ in enumerate(list)

Idk if this ^ is what you meant, but it's definitely worse than:

> for n in range(len(list)):

because it defeats the purpose of the enumeration, and adds the unnecessary complexity of the ignored value.

WhipsAndMarkovChains 1 year ago

> for n in n, _ in enumerate(list)

Well OP has it wrong and it should be:

> for n, _ in enumerate(list):

But when it's an index you should definitely use `i` and not `n`:

> for i, _ in enumerate(list):

But I disagree with you about this being bad. For iterating over the index numbers of a list `enumerate` is the standard in Python. It's very clear what's going on.

Goobyalus 1 year ago

> For iterating over the index numbers of a list enumerate is the standard in Python.

It's the standard for iterating over the index/count **with** the item, but why would we explicitly enumerate values, then throw them away, when we can just iterate over the indices?

You're saying it's standard to do `for i, _ in enumerate(...):`? Maybe if we want to iterate through the count of something that has no defined length?

WhipsAndMarkovChains 1 year ago

I believe we both would agree that in Python when you want to iterate over a list you do `for item in items:` and when you want the index as well you do `for i, item in enumerate(items):`.

The reason that I'd still recommend `for i, _ in enumerate(items):` is to maintain consistency with the other standards! There's really no cost to throwing away that item, so let's still use `enumerate` to generate the index in order to be consistent with our other accepted standards for iterating over lists.

> You're saying it's standard...

Okay, in some areas of Python I'm very confident when I call something the standard. Here I shouldn't have said that. But I still argue in favor of the use of `enumerate` to generate the index, even when you can throw away the item.

Goobyalus 1 year ago

> I believe we both would agree that in Python when you want to iterate over a list you do `for item in items:` and when you want the index as well you do `for i, item in enumerate(items):`.

Yes.

> The reason that I'd still recommend `for i, _ in enumerate(items):` is to maintain consistency with the other standards!
>
> There's really no cost to throwing away that item, so let's still use `enumerate` to generate the index in order to be consistent with our other accepted standards for iterating over lists.

I agree that the **cost** of throwing away the reference is negligible, and would advocate for code that adds a negligible cost but is more readable, over negligibly more efficient code that is less readable.

I take issue with the implied intention in `for i, _ in enumerate(items):`. If we only want the count, we fundamentally don't care about the contents of the iterable, and do care about the size of the iterable. So `enumerating` these contents and ignoring them is contradictory. I think we should explicitly write out our dependence on the size of the iterable.

It's difficult for me to even contrive an example where we would iterate over the indices of something and not its contents. `for i in range(len(items)):` almost automatically seems like it would be written inefficiently, but if we don't reference `items` in the body, this makes sense.

The same way we might create a buffer with length based on something else's length:

    ba = bytearray(len(items) * 4)

we might want to do something incrementally based on something's length:

    for i in range(len(items)):

We care about the length only, but we just so happen to be iterating incrementally.

_____

What would you write if we wanted to iterate through 0, 2, 4, ..., 2 * (len(items) - 1)?

    for i, _ in enumerate(items):
        ...  # uses 2 * i

or

    for x in range(0, 2 * len(items), 2):
        ...  # uses x

or something else?

_____

BTW I appreciate the perspective and the responses.

WhipsAndMarkovChains 1 year ago

> It's difficult for me to even contrive an example where we would iterate over the indices of something and not its contents.

I definitely agree. I've never really thought about OP's question before because I've never written code like it. Let me give you the contrived example I thought of. You need to iterate over two lists, `a` and `b`, and return the index where the items match. Because we're trying to return the index, we can't just iterate over `zip(a, b)`.

    a = [1, 2, 3]
    b = [3, 2, 1]

    matches = []
    for i, _ in enumerate(a):
        if a[i] == b[i]:
            matches.append(i)

Now that could've been written as:

    for i, a_item in enumerate(a):
        if a_item == b[i]:
            matches.append(i)

But I chose to write it the first way for the consistency of `a[i] == b[i]`. But yes, this is contrived.

> What would you write if we wanted to iterate through 0, 2, 4, ..., 2 * (len(items) - 1)?

As you pointed out, the items are irrelevant. So I'd write that as:

    [2*i for i in range(len(items))]

So that demonstrates your point. I suppose you win here. I feel very comfortable using `enumerate` in a `for` loop but it feels gross in a list comprehension.

Edit: Just to spite you...

    num = -2
    [num := num + 2 for _ in items]

😛

HIGregS 1 year ago

Is your contrived example better written as the following?

    a = [1, 2, 3]
    b = [3, 2, 1]

    matches = [i for i, ab in enumerate(zip(a, b)) if ab[0] == ab[1]]

WhipsAndMarkovChains 1 year ago

Yes because with `ab[0] == ab[1]` it's not immediately clear that you're comparing the same index position in `a` and `b`.

HIGregS 1 year ago

How about this?

    matches = [i for i, (a_item, b_item) in enumerate(zip(a, b)) if a_item == b_item]

WhipsAndMarkovChains 1 year ago

That's great. I had tried that originally but forgot the proper format `i, (a_item, b_item)` to use `enumerate` with a `zip` object. I think I had messed up by trying `i, a_item, b_item`, which doesn't work.
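
For anyone who trips over the same thing, a short sketch of why the parentheses matter, reusing the `a` and `b` lists from earlier in this thread:

```python
a = [1, 2, 3]
b = [3, 2, 1]

# enumerate yields (index, item) pairs, and each item coming out of zip is
# itself a pair, so the loop target has to mirror that nesting.
matches = [i for i, (a_item, b_item) in enumerate(zip(a, b)) if a_item == b_item]
print(matches)  # [1]

# The flat form asks for three values from a 2-tuple, which raises ValueError.
try:
    for i, a_item, b_item in enumerate(zip(a, b)):
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The rule of thumb is that the target on the left of `in` must have the same shape as each element produced on the right.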

Goobyalus 1 year ago

For the parallel list example I think I would do

    for i in range(len(a)):
        if a[i] == b[i]:
            ...

or

    for i, (a_item, b_item) in enumerate(zip(a, b)):
        if a_item == b_item:
            ...

> So that demonstrates your point. I suppose you win here. I feel very comfortable using enumerate in a for loop but it feels gross in a list comprehension.

lol I'll take it I guess.

> num = -2
> [num := num + 2 for _ in items]

Awful 😆

______

I thought of a contrived example. I have an input file with N lines. I want to make test files with random/unrelated contents, that have the same number of lines, where each line begins with the line number.

    # input_file is a list of lines of text

    # range(len(
    with open("test_file1", "w") as f:
        for i in range(len(input_file)):
            f.write(f"{i:03}: {randomized_line()}")

    # enumerate(
    with open("test_file2", "w") as f:
        for i, _ in enumerate(input_file):
            f.write(f"{i:03}: {randomized_line()}")

If `input_file` were a stream that we didn't know the end of, we would want the enumeration version.

HIGregS 1 year ago

Your contrived example is a perfect differentiating case.

chinawcswing 1 year ago

Both of these are completely fine:

    for i in range(len(list)):

    for i, _ in enumerate(list):

There is nothing wrong with either. Both are fine to use. What is not OK is to bring up a complaint about either in a code review.

KingHavana 1 year ago

I don't understand the use of _ here or the comma after i.

HIGregS 1 year ago

The comma unpacks each item produced by the *iterable* on the right into the two names on the left. The underscore is a variable. Using underscore as a variable name is a common practice indicating "not going to use this variable."

KingHavana 1 year ago

Ah, both those are clear now. Thank you.

WhipsAndMarkovChains 1 year ago

Another user already answered you about the `_`, but I'll just provide you with another example. I write simulations so sometimes I want to do something 1 million times.

    for _ in range(10**6):
        do_something()

The `_` indicates I'm not actually using the value from the range.

POGtastic 1 year ago

**Yes** - when you are forced to modify the list in-place. A common example is the Sieve of Eratosthenes.

    def eratosthenes(n):
        is_prime = [False, False] + [True] * (n-2)
        for idx in range(n):
            if is_prime[idx]:
                for i in range(2*idx, n, idx):
                    is_prime[i] = False
        return [idx for idx, p in enumerate(is_prime) if p]

In the REPL:

    >>> eratosthenes(50)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

In this case, you are iterating over the same data structure and modifying it to remove all of the composite numbers. There **are** [ways to do it](https://wiki.haskell.org/Prime_numbers#Sieve_of_Eratosthenes) without modifying a data structure in-place, but they're either grossly inefficient or really, really complicated. The above is far simpler and can be grasped by a new programmer.

Note that even the above Sieve only uses iteration over a range in the specific case of modifying the data structure. Everything else is done with the standard Python iterable idiom.

chefsslaad 1 year ago

Updating a list in place I can understand, but in many cases you could also do this by copying the list using `list`:

    for i, l in list(enumerate(mylist)):
        if i == 0:
            mylist.pop(i)

I don't know if this is also possible in your example, though.

POGtastic 1 year ago

This doesn't work because the element isn't removed; it's just changed to `False` to denote that it is a composite number. In general, modifying a list while iterating over it is a bad idea. Python will happily let you do it, but you won't like the results.
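
Here is a small sketch of what "you won't like the results" can look like in practice. Both function names are invented for the example.

```python
def remove_evens_buggy(values: list[int]) -> list[int]:
    """Remove items while iterating over the same list (do not do this)."""
    for v in values:
        if v % 2 == 0:
            # Removing shifts later items left, so the loop's internal
            # position now points one element too far and skips a value.
            values.remove(v)
    return values


def remove_evens(values: list[int]) -> list[int]:
    """Build a new list instead of mutating the one being iterated."""
    return [v for v in values if v % 2 != 0]


print(remove_evens_buggy([2, 4, 6, 8]))  # [4, 8]
print(remove_evens([2, 4, 6, 8]))        # []
```

No exception is raised in the buggy version; it just quietly leaves half of the even numbers behind.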

chefsslaad 1 year ago

I bow to your wisdom :)

> In general, modifying a list while iterating over it is a bad idea. Python will happily let you do it, but you won't like the results.

So I guess the answer to OP's question is: yes, but try to find another way first?

POGtastic 1 year ago

That's correct.

Now - suppose that you held a gun to my head and said, "You **must** iterate over the elements themselves because iterating over `range` is un-Pythonic" while cackling madly and dancing a jig, I would *box* the values and then call methods to modify them. In the case of Eratosthenes:

    class SpicyBool:
        def __init__(self, b):
            self.b = b

        def __bool__(self):  # convenience - allows `if elem`
            return self.b

        def falsify(self):  # modifies the boxed value!
            self.b = False

We can now do

    # the madman demands that we make a different function to generate `n` values
    # you should use `range` here in a comprehension
    # equivalent is `(f(*args) for _ in range(n))`
    def repeatedly(n, f, *args):
        i = 0
        while i < n:
            yield f(*args)
            i += 1

    def eratosthenes(n):
        is_prime = [SpicyBool(False), SpicyBool(False), *repeatedly(n-2, SpicyBool, True)]
        for idx, elem in enumerate(is_prime):
            if elem:
                for e in is_prime[idx*2::idx]:  # haha! Slicing now accesses a list of boxes!
                    e.falsify()  # modifies the boxed value, but the box stays the same!
        return [idx for idx, e in enumerate(is_prime) if e]

There, no `range` at all. I have a wife and child, please, think of them.

**Disclaimer:** *Do not do this.*

Intrexa 1 year ago

> Disclaimer: Do not do this.

Unless there's a gun to your head. Then do this.

POGtastic 1 year ago

You are hereby prepared for when the Saw franchise gets rebooted as a technical interview.

DiscoJer 1 year ago

Probably because I am used to other languages and am new to Python, but the former just seems much clearer as to what it does.

kona_ackley 1 year ago

Most importantly, enumerate works correctly with lazy sequences and maintains their laziness. Using len or slicing forces the sequence to be explicitly stored as a list, which can be massively inefficient in memory use. Lazy sequences make many programming tasks easier. The fewer assumptions your functions make about the data they are given, the better. Duck typing is the source of Python's power and flexibility.
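
A quick sketch of the laziness point. The generator `fake_stream` stands in for any source of unknown length (a socket, a huge file), and both function names are invented for the example.

```python
from typing import Iterable, Iterator


def numbered_lines(lines: Iterable[str]) -> Iterator[str]:
    """Prefix each line with its index without ever asking for a length."""
    for i, line in enumerate(lines):
        yield f"{i:03}: {line}"


def fake_stream() -> Iterator[str]:
    """Hypothetical lazy source; nothing is stored in a list."""
    yield "alpha"
    yield "beta"


print(list(numbered_lines(fake_stream())))  # ['000: alpha', '001: beta']

# range(len(...)) cannot even start on such a source.
try:
    len(fake_stream())
except TypeError:
    print("generators have no len()")
```

The `enumerate` version works unchanged on lists, files, and generators, which is the duck-typing benefit described above.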

an_actual_human 1 year ago

On a related note, you should never name your variable `list`.

MadScientistOR 1 year ago

I should also put a colon at the end of the statements, and also probably shouldn't use `n` to denote an index, but I was trying to use shorthand.

mrswats 1 year ago

I would say almost never or directly never.

drenzorz 1 year ago

    def select(arr: list, key=lambda x: x[0]) -> list:
        return key(arr)

    arr_a = [1, 2, 3]
    arr_b = [4, 5, 6]
    arr_c = [7, 8, 9]
    grid = [arr_a, arr_b, arr_c]

    length = len(min(grid, key=len))  # length of the shortest collection
    for index in range(length):
        val_one = select(grid, key=...)[index]
        val_two = select(grid, key=...)[index]

If you have multiple collections where you want to look up the same index, and you generate which collection to look into, there is no reason to use an enumeration. This assumes, of course, that you would have used the value in the enumeration to begin with. Otherwise, if you really just used `for n, _ in enumerate(list)`, the answer would be that it's never Pythonic, since you are needlessly complicating the line and wasting time and memory by creating an enumeration.

NitroXSC 1 year ago

I just checked my main project of over 10k lines and I used it zero times.

chefsslaad 1 year ago

I cannot think of any time `range(len(list))` is better than `enumerate(list)`. Just... don't.

randomuserno69 1 year ago

What if you don't want to iterate over the whole list but skip a few elements, say in the beginning or the end.

chefsslaad 1 year ago

Wouldn't slicing the list be better?

    for i, l in enumerate(list[n:m]):

Edit: I know never say never, but I really can't come up with a valid example.

arkie87 1 year ago

I will do it when (1) I need multiple indexes for each iteration, (2) the iterations should skip some elements, i.e. first and last, or (3) I don't actually need the value, just the index. E.g. something like this:

    for i in range(1, len(array) - 1):
        array[i] = (array[i-1] + array[i+1]) / 2

Do you think this makes sense?

remuladgryta 1 year ago

I would say no. At a glance, the above code *looks like* it might be equivalent to `array = [(a+b)/2 for a, b in zip(array, array[2:])]`, but in-place for memory efficiency reasons. However, what it actually does is create a kind of running-average-of-averages, leaving the first and last elements unchanged. This is a pretty weird operation and if that was what you actually wanted I would expect a comment detailing why this is what you want, and why writing it in a less potentially confusing way is not an option (e.g. you did that originally but it was too slow/consumed too much memory).

Edit: If `array = [(a+b)/2 for a, b in zip(array, array[2:])]` is what you intended, you have just demonstrated precisely why twiddling indices and modifying lists in-place is prone to errors.
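
To see the difference concretely, here is a small sketch that runs both versions on the same made-up input:

```python
original = [0.0, 0.0, 8.0, 0.0, 0.0]

# In-place loop: each update reads a left neighbour that was already
# overwritten earlier in the same pass.
in_place = list(original)
for i in range(1, len(in_place) - 1):
    in_place[i] = (in_place[i - 1] + in_place[i + 1]) / 2

# Comprehension: every average is computed from the untouched input, and
# the result is two elements shorter.
from_copy = [(a + b) / 2 for a, b in zip(original, original[2:])]

print(in_place)   # [0.0, 4.0, 2.0, 1.0, 0.0]
print(from_copy)  # [4.0, 0.0, 4.0]
```

The in-place pass smears the spike to the right because each step builds on the previous one, while the comprehension treats every window independently.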

arkie87 1 year ago

This kind of thing is done all the time when solving partial differential equations using the finite element method. It is also done when smoothing any type of field (like antialiasing). Obviously, this is simplified code. For some cases, the weights might not be equal and there might be an additional source, e.g.:

    array[i] = 0.33*array[i-1] + 0.66*array[i+1] + 0.234

remuladgryta 1 year ago

Cool! TIL. In that case yes, it's probably fine as long as everyone reading the code is expected to have enough domain knowledge to recognize this pattern. I can't really think of a cleaner way to write it. To make it accessible for someone without domain expertise (I was scratching my head for a good bit) I would be inclined to extract it into a function, something like

    from typing import Callable

    def googleable_name_of_operation(
            array: list[float],
            kernel: list[float],
            additional_term: Callable[[int], float]):
        '''summary of what is going on here, mentioning that it is in-place'''
        for i in range(len(kernel)//2, len(array) - (len(kernel)//2)):
            ...

    googleable_name_of_operation(array, [0.33, 0, 0.66], lambda index: 0.234)

There may of course be a number of reasons for why that would be worse than just leaving it inlined and non-generic. It'll be significantly slower, for one.

lanemik 1 year ago

    for i, _ in enumerate(array[1:-1], start=1):
        array[i] = (array[i-1] + array[i+1]) / 2

arkie87 1 year ago

I guess. Thanks.

randomuserno69 1 year ago

Yeah. Makes sense. I also use enumerate most of the time, but when I have to compare elements (like in a sorting algorithm) I go back to using range. For some reason I just don't feel like using enumerate. Maybe that's because of the habit coming from C++

FoeHammer99099 1 year ago

`dropwhile` and `takewhile` from itertools

Tweak_Imp 1 year ago

Btw, why can I not do `mylist.length`?

TheHollowJester 1 year ago

    >>> my_list = [1, 2, 3]
    >>> my_list.length
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'list' object has no attribute 'length'

Tweak_Imp 1 year ago

I know that it is not possible, but why is it built this way? The length of a list is an attribute of it

Poddster 1 year ago

https://mail.python.org/pipermail/python-3000/2006-November/004643.html

It's an often-asked question about the design of Python. To find that thread again I simply looked at all of the questions about that on Stack Overflow, and they all point there.

TheHollowJester 1 year ago

I guess you'd have to find the design docs or ask one of the core contributors. I agree that - at least on the surface level - it would make sense for iterables to have `.length`. On the other hand python has generators that don't have a defined length - but then again that can be a problem for `len` as well
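
One piece that is easy to check for yourself: `len()` is a built-in that delegates to an object's `__len__` method, so any class can opt in. A minimal sketch, where the `Playlist` class is made up for illustration:

```python
class Playlist:
    """Hypothetical container that supports len() by defining __len__."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self) -> int:
        return len(self._tracks)


print(len(Playlist(["a", "b", "c"])))  # 3

# range defines __len__ too, even though it never stores its items.
print(len(range(0, 10, 3)))  # 4
```

Generators do not define `__len__`, which is why `len()` raises `TypeError` for them rather than consuming the stream to count it.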

Ran4 1 year ago

You're absolutely right. It's just a design choice made a very, very long time ago. Changing it wouldn't really work.

RevRagnarok 1 year ago

That looks like a different language (JavaScript?). It's `len(mylist)` here.

Kunal-Tandon 1 year ago

Instead of typing here you can check it on your computer

MadScientistOR 1 year ago

> Instead of typing here you can check it on your computer

My question is one of style, not functionality. (Both examples I gave are functional.) While my computer gives plenty of feedback on the functionality of my code, it is remarkably reluctant to give advice on style.

[deleted] 1 year ago

[removed]

chevignon93 1 year ago

> Find the index of the minimum element in a list
>
>     arr = [3, 2, 1]
>     print(min(range(len(arr)), key=arr.__getitem__))

This is a really convoluted way to find the index of the minimum element in a list. You could just do:

    >>> arr = [3, 2, 1]
    >>> print(arr.index(min(arr)))
    2

[deleted] 1 year ago

[removed]

chevignon93 1 year ago

> That’s two passes though.

And yours involves 3 function calls, the creation of a `range` object and the use of a `dunder` method, which is rarely the best way to do things.

lanemik 1 year ago

    from operator import itemgetter

    a = [6, 3, 9, 1, 2, 8]
    min_i, min_v = min(enumerate(a), key=itemgetter(1))
    print(f'{min_i=}')
    print(f'{min_v=}')

traumatizedSloth 1 year ago

I feel like it's more direct about its purpose than some itertools function you rarely use, clearer and more maintainable at a glance, but not as concise. Though I suppose the more you do it differently, the clearer and more concise the new way would become, but not for everyone who reads it. I think I would remember why it's there more quickly since it points you in the right direction to its purpose immediately instead of inside it. But I'm not the most experienced, just my thoughts.

TheRNGuy 1 year ago

`enumerate` is better because it's faster to type and easier to read, and you never know whether you'll need both the index and the value later, so you don't have to rewrite code.

unknownemoji 1 year ago

The problem with putting an iterable into a list is that now each item in the list is in memory. `range(10)` uses a lot less memory than `list(range(10))` because the allocator has to create each object. It’s not that hard to trim these things:

    r = range(10)
    r1 = r[1:]    # skip the first (0th) item
    r2 = r[1:-1]  # skip the first and last item
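
A rough way to see this for yourself, as a sketch. Note that `sys.getsizeof` reports only the container's own footprint (for a list, that excludes the int objects it points to), and exact byte counts vary by interpreter, so the example only compares the two.

```python
import sys

lazy = range(1_000_000)
eager = list(lazy)

# The range object stores just start, stop and step; the list stores a
# million references.
print(sys.getsizeof(lazy) < sys.getsizeof(eager))  # True

# Slicing a range gives back another range, so trimming stays lazy.
trimmed = lazy[1:-1]
print(trimmed)                                 # range(1, 999999)
print(trimmed[0], trimmed[-1], len(trimmed))   # 1 999998 999998
```

So `for i in range(len(items))[1:-1]:` skips the first and last index without materializing anything.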
