Reference

Have you ever had to work with a dataset so large that it overwhelmed your machine’s memory? Or maybe you have a complex function that needs to maintain an internal state every time it’s called, but the function is too small to justify creating its own class. In these cases and more, generators and the Python yield statement are here to help.

By the end of this course, you’ll know:

• What generators are and how to use them
• How to create generator functions and expressions
• How the Python yield statement works
• How to use multiple Python yield statements in a generator function
• How to use advanced generator methods
• How to build data pipelines with multiple generators

Understanding Generators

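A generator function yields a value and then pauses, keeping its local state until the next value is requested.

If a generator function contains multiple yield statements, each call to next() resumes execution and runs to the following yield, returning that statement’s value.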
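As a minimal sketch of this behavior (the function name multi_yield and the yielded strings are made up for illustration), each call to next() advances the generator to its next yield:

```python
def multi_yield():
    # Execution pauses at each yield and resumes here on the next call to next().
    yield "first value"
    yield "second value"


gen = multi_yield()
first = next(gen)   # runs up to the first yield
second = next(gen)  # resumes and runs up to the second yield
print(first)   # first value
print(second)  # second value
```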

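A for loop calls next() on the generator for you and stops cleanly once the generator is exhausted.

If you call next() yourself after the last value has been yielded, the generator raises StopIteration.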
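The two ways of consuming a generator can be compared in a short sketch. The generator count_up_to is a hypothetical example, not part of the course material:

```python
def count_up_to(limit):
    # Yield 1, 2, ..., limit and then finish.
    n = 1
    while n <= limit:
        yield n
        n += 1


# A for loop handles the end of the generator automatically.
collected = []
for value in count_up_to(3):
    collected.append(value)
print(collected)  # [1, 2, 3]

# Calling next() manually past the end raises StopIteration.
gen = count_up_to(1)
next(gen)
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```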

Examples

Implement my_enumerate

Write your own generator function that works like the built-in function enumerate.

Calling the function like this:

should output:

Solution:
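One possible solution, as a minimal sketch. The sample list lessons and the start value of 1 are assumptions made for illustration, since the original call and output are not shown here:

```python
def my_enumerate(iterable, start=0):
    # Pair each element with a running count, like the built-in enumerate.
    count = start
    for element in iterable:
        yield count, element
        count += 1


lessons = ["Generators", "Yield", "Pipelines"]  # hypothetical sample data
for i, lesson in my_enumerate(lessons, 1):
    print(f"Lesson {i}: {lesson}")
# Lesson 1: Generators
# Lesson 2: Yield
# Lesson 3: Pipelines
```

Because the function only loops over its argument, it should accept any iterable, including another generator.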

Chunker

If you have an iterable that is too large to fit in memory in full (e.g., when dealing with large files), being able to take and use chunks of it at a time can be very valuable.

Implement a generator function, chunker, that takes in an iterable and yields a chunk of a specified size at a time.

Calling the function like this:

should output:

Better solution:
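A range-based version might look like the sketch below. It assumes the input supports len() and slicing (a list, a string, or a range), so it would not work on an arbitrary generator. The call with range(10) and a chunk size of 4 is an illustrative choice:

```python
def chunker(iterable, size):
    # Step through the start indices and yield one slice per step.
    for start in range(0, len(iterable), size):
        yield iterable[start:start + size]


chunks = [list(chunk) for chunk in chunker(range(10), 4)]
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```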

Here we don’t have to worry about the stop index running past the end of the sequence, because slicing (including slicing a range) clamps it to the sequence length.

Worse solution:

Generator Comprehension

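A list comprehension uses square brackets and builds the whole list in memory at once.

A generator comprehension (also called a generator expression) uses parentheses and produces its items one at a time, on demand.

Comparing the size of the two objects shows the memory difference.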
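A quick way to see the difference is sys.getsizeof(), as in this sketch. The variable names and the range of 10,000 items are illustrative, and the exact byte counts depend on the Python version and platform, so none are quoted here:

```python
import sys

# Square brackets: the full list is built immediately.
nums_squared_lc = [n ** 2 for n in range(10000)]

# Parentheses: a generator object that computes items lazily.
nums_squared_gc = (n ** 2 for n in range(10000))

list_size = sys.getsizeof(nums_squared_lc)
gen_size = sys.getsizeof(nums_squared_gc)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")
```

Note that sys.getsizeof() reports the size of the container object itself, not the integers it refers to. Even so, the generator object stays small no matter how long the range is.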

The Python Profilers

Once you’ve learned the difference in syntax, you’ll compare the memory footprint of both, and profile their performance using cProfile.
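Profiling the two forms could be sketched as follows. The helper names sum_with_list, sum_with_generator, and profile_call are hypothetical. Only cProfile.Profile, its runcall() method, and pstats.Stats come from the standard library:

```python
import cProfile
import io
import pstats


def sum_with_list(n):
    # Builds the full list first, then sums it.
    return sum([i * 2 for i in range(n)])


def sum_with_generator(n):
    # Feeds values to sum() one at a time.
    return sum(i * 2 for i in range(n))


def profile_call(func, n):
    # Run func(n) under the profiler and return its result plus a text report.
    profiler = cProfile.Profile()
    result = profiler.runcall(func, n)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)
    return result, report.getvalue()


for func in (sum_with_list, sum_with_generator):
    result, report = profile_call(func, 10000)
    print(func.__name__, result)
    print(report)
```

Timings vary from machine to machine, so the reports are best read as a relative comparison between the two functions.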

Conclusion

• Generators are iterable objects
• They keep state between calls, meaning they can remember a place in a sequence without holding the entire sequence in memory.
• They save memory, but can be slower than other iterables such as lists, so there is a tradeoff.

Using Advanced Generator Methods

• .send()
• .throw()
• .close()

In this lesson, you’ll learn about the advanced generator methods .send(), .throw(), and .close(). To practice with these new methods, you’re going to build a program that makes use of all three.

As you follow along in the lesson, you’ll learn that yield is an expression, rather than a statement. You can still use it like a statement, but because it is an expression it also evaluates to a value inside the generator, and you can use .send() to pass a new value back in at that point. You’ll also handle exceptions with .throw() and stop the generator after a given number of digits with .close().

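1. The first palindrome yielded is 11, so pal_gen.send() passes 10 ** len(str(11)), which is 100, back into the generator, where it becomes the value of i = (yield num).
2. Now i = 100, so the search resumes from 100, and the next palindrome the loop receives is 111.
3. pal_gen.send() then passes 10 ** len(str(111)), which is 1000, to i = (yield num), and the pattern repeats.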
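The steps above can be sketched as follows. The names pal_gen, num, and i come from the steps themselves. The functions is_palindrome, infinite_palindromes, and first_jumps are assumed reconstructions, not necessarily the course’s own code:

```python
def is_palindrome(num):
    # Treat only multi-digit numbers as palindromes.
    if num < 10:
        return False
    text = str(num)
    return text == text[::-1]


def infinite_palindromes():
    num = 0
    while True:
        if is_palindrome(num):
            # The value passed to .send() becomes the result of this yield.
            i = (yield num)
            if i is not None:
                num = i
        num += 1


def first_jumps(count):
    # Collect the first `count` palindromes seen by the for loop.
    pal_gen = infinite_palindromes()
    seen = []
    for pal in pal_gen:
        seen.append(pal)
        if len(seen) == count:
            break
        # Jump ahead to the next power of ten.
        pal_gen.send(10 ** len(str(pal)))
    return seen


print(first_jumps(3))  # [11, 111, 1111]
```

In this sketch each .send() call also returns a palindrome (101 after sending 100), which is discarded. That is why the loop receives 111 rather than 101.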

throw
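A minimal sketch of .throw(): the exception is raised inside the generator at the point where it is paused, and if the generator does not catch it, it propagates back to the caller. The generator and the helper collect_until_too_long are assumed reconstructions, and the five-digit limit is an illustrative choice:

```python
def is_palindrome(num):
    # Treat only multi-digit numbers as palindromes.
    text = str(num)
    return num >= 10 and text == text[::-1]


def infinite_palindromes():
    num = 0
    while True:
        if is_palindrome(num):
            i = (yield num)
            if i is not None:
                num = i
        num += 1


def collect_until_too_long(max_digits):
    pal_gen = infinite_palindromes()
    seen = []
    try:
        for pal in pal_gen:
            digits = len(str(pal))
            if digits >= max_digits:
                # Raised at the paused yield; the generator does not handle it.
                pal_gen.throw(ValueError("palindrome too long"))
            seen.append(pal)
            pal_gen.send(10 ** digits)
    except ValueError as error:
        return seen, str(error)
    return seen, None


print(collect_until_too_long(5))  # ([11, 111, 1111], 'palindrome too long')
```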

close
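A minimal sketch of .close(): it stops the generator so that any later request for a value finds it exhausted. The generator and the helper collect_then_close are assumed reconstructions, and the five-digit limit is an illustrative choice:

```python
def is_palindrome(num):
    # Treat only multi-digit numbers as palindromes.
    text = str(num)
    return num >= 10 and text == text[::-1]


def infinite_palindromes():
    num = 0
    while True:
        if is_palindrome(num):
            i = (yield num)
            if i is not None:
                num = i
        num += 1


def collect_then_close(max_digits):
    pal_gen = infinite_palindromes()
    seen = []
    for pal in pal_gen:
        digits = len(str(pal))
        if digits >= max_digits:
            pal_gen.close()  # raises GeneratorExit inside the generator
            break
        seen.append(pal)
        pal_gen.send(10 ** digits)
    return seen, pal_gen


seen, closed_gen = collect_then_close(5)
print(seen)                           # [11, 111, 1111]
print(next(closed_gen, "exhausted"))  # exhausted
```

The break after .close() matters in this sketch: calling .send() on a closed generator would raise StopIteration.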

Creating Data Pipelines With Generators

In this lesson, you’ll learn how to use generator expressions to build a data pipeline. Data pipelines allow you to string together code to process large datasets or streams of data without maxing out your machine’s memory.

For this example, you’ll use a CSV file that is pulled from the TechCrunch Continental USA dataset, which describes funding rounds and dollar amounts for various startups based in the USA. Click the link under Supporting Material to download the dataset included with the sample code for this course.
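A pipeline of this kind might be sketched as below. The column names round and raisedAmt, the helper total_raised, and the sample rows are all assumptions made for illustration. The rows are invented and are not taken from the real dataset:

```python
import io

# Hypothetical stand-in for the CSV file (made-up rows).
SAMPLE_CSV = """company,round,raisedAmt
ExampleCo,a,1000000
SampleSoft,b,2500000
DemoWorks,a,500000
"""


def total_raised(lines, round_name):
    # Each stage is a generator expression, so rows flow through one at a time.
    rows = (line.rstrip().split(",") for line in lines)
    header = next(rows)  # the first row holds the column names
    records = (dict(zip(header, row)) for row in rows)
    amounts = (
        int(record["raisedAmt"])
        for record in records
        if record["round"] == round_name
    )
    return sum(amounts)


total = total_raised(io.StringIO(SAMPLE_CSV), "a")
print(total)  # 1500000
```

Nothing is read until sum() starts pulling values through the chain, so only one row is held in memory at a time. Splitting on commas is a simplification. For files with quoted fields, the standard csv module would be a safer choice.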

TechCrunch