Converting a Python Scrapy script to an exe file with PyInstaller

I’m trying to convert a scrapy script to an exe file.
The main.py file looks like this:

from scrapy.crawler import CrawlerProcess
from amazon.spiders.amazon_scraper import Spider

spider = Spider()
process = CrawlerProcess({
    'FEED_FORMAT': 'csv',
    'FEED_URI': 'data.csv',
    'DOWNLOAD_DELAY': 3,
    'RANDOMIZE_DOWNLOAD_DELAY': True,
    'ROTATING_PROXY_LIST_PATH': 'proxies.txt',
    'USER_AGENT_LIST': 'useragents.txt',
    'DOWNLOADER_MIDDLEWARES' : 
    {
        'rotating_proxies.middlewares.RotatingProxyMiddleware': 610,
        'rotating_proxies.middlewares.BanDetectionMiddleware': 620,
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        'random_useragent.RandomUserAgentMiddleware': 400
    }
})

process.crawl(spider)
process.start() # the script will block here until the crawling is finished

The scrapy script looks like any other script. I’m using pyinstaller.exe --onefile main.py to convert it to an exe file. When I run the main.exe file in the dist folder, it outputs this error:

FileNotFoundError: [Errno 2] No such file or directory: '...\\scrapy\\VERSION'

I can fix it by creating a scrapy folder in the dist folder and copying the VERSION file into it from lib/site-packages/scrapy.
After that, there were many other errors, but I could fix them by copying over some more scrapy library files.

Finally, it started outputting this error:

ModuleNotFoundError: No module named 'email.mime'

Not sure what that means. I’ve never seen it.

I’m using:

Python 3.6.5
Scrapy 1.5.0
pyinstaller 3.3.1

Solution

I had the same situation.
Instead of trying to get pyinstaller to collect this file (all my attempts failed), I decided to change the parts of the scrapy code that read it, so that the error cannot occur.

I noticed that the \scrapy\VERSION file is used in only one place: \scrapy\__init__.py.
I decided to hardcode the value from \scrapy\VERSION by changing \scrapy\__init__.py:

#import pkgutil
__version__ = "1.5.0" #pkgutil.get_data(__package__, 'VERSION').decode('ascii').strip()
version_info = tuple(int(v) if v.isdigit() else v
                     for v in __version__.split('.'))
#del pkgutil

With this change, there is no need to store the version in an external file.
The error no longer occurs because the \scrapy\VERSION file is never read.
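
As a quick check that the hardcoded string gives the version_info tuple you would expect, the tuple expression can be run on its own, outside scrapy. This is only a minimal sketch; the helper name parse_version_info is hypothetical and is not part of scrapy.

```python
def parse_version_info(version):
    """Mirror the tuple expression from scrapy/__init__.py (hypothetical helper name)."""
    # Purely numeric components become ints; anything else stays a string.
    return tuple(int(v) if v.isdigit() else v for v in version.split('.'))


version_info = parse_version_info("1.5.0")
print(version_info)
```

If you upgrade scrapy later, the hardcoded string has to be updated by hand, since it is no longer read from the VERSION file.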

After that, I ran into the same FileNotFoundError: [Errno 2], this time for the \scrapy\mime.types file.
The situation is the same: \scrapy\mime.types is used in only one place, \scrapy\responsetypes.py.

...
#from pkgutil import get_data
...
    def __init__(self):
        self.classes = {}
        self.mimetypes = MimeTypes()
        #mimedata = get_data('scrapy', 'mime.types').decode('utf8')
        mimedata = """
        Copypaste all 750 lines of \scrapy\mime.types here
"""
        self.mimetypes.readfp(StringIO(mimedata))
        for mimetype, cls in six.iteritems(self.CLASSES):
            self.classes[mimetype] = load_object(cls)

This change resolves the FileNotFoundError: [Errno 2] for the \scrapy\mime.types file.
I agree that hardcoding 750 lines of text into Python code is not the best decision.
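
To see what the readfp call does with inline text, here is a small standalone sketch that uses only the standard library. The two entries and the .scrapydemo, .sdemo and .snote extensions are made up for illustration; the real mime.types file uses the same "type extension ..." layout, just with far more lines.

```python
from io import StringIO
from mimetypes import MimeTypes

# Hypothetical stand-in for the contents of scrapy/mime.types:
# each line is a MIME type followed by the extensions mapped to it.
MIME_DATA = """
application/x-scrapy-demo scrapydemo sdemo
text/x-scrapy-note snote
"""

mime_types = MimeTypes()
# readfp parses the in-memory text the same way it would parse a file on disk.
mime_types.readfp(StringIO(MIME_DATA))

guessed_type, encoding = mime_types.guess_type("report.scrapydemo")
print(guessed_type, encoding)
```

Because the data comes from a string, nothing has to exist on disk when the frozen exe runs, which is the point of the change above.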

Then I started getting ModuleNotFoundError: No module named scrapy.spiderloader. I fixed it by passing "scrapy.spiderloader" to pyinstaller with the --hidden-import option.

The next error was ModuleNotFoundError: No module named scrapy.statscollectors, which I fixed the same way.

The final version of the pyinstaller command for my scrapy script contained 46 hidden imports. After that I got a working .exe file.
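
Typing dozens of --hidden-import flags by hand is error-prone, so one option is to generate the command from a list. The sketch below only builds and prints the argument list; it does not run pyinstaller. It includes just the two modules named above, because the full set of 46 is not listed here, and the helper name build_pyinstaller_args is hypothetical.

```python
def build_pyinstaller_args(script, hidden_imports, onefile=True):
    """Build a pyinstaller command line (hypothetical helper, not part of PyInstaller)."""
    args = ["pyinstaller"]
    if onefile:
        args.append("--onefile")
    for module in hidden_imports:
        # One --hidden-import flag per module that pyinstaller fails to detect.
        args.extend(["--hidden-import", module])
    args.append(script)
    return args


# Only the two modules mentioned in this answer; extend the list as new
# ModuleNotFoundError messages appear.
HIDDEN_IMPORTS = ["scrapy.spiderloader", "scrapy.statscollectors"]

command = build_pyinstaller_args("main.py", HIDDEN_IMPORTS)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command) to start the build, adding one module to HIDDEN_IMPORTS each time the exe reports another missing module.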
