diff --git a/README.md b/README.md
index bce22d9..975dc0c 100644
--- a/README.md
+++ b/README.md
@@ -2,9 +2,20 @@ forked from https://github.com/KrzysztofSajdok/parser_OLX
 
 simple parser used to retrieve amount of sale, rent and exchange avertisements from OLX.pl in real estate category.
 
+Installation:
 ```
-python -m venv .venv && source .venv/bin/activate
-pip install -r requirements.txt
+python3 -m venv .venv && source .venv/bin/activate
+pip3 install -r requirements.txt
+sudo apt install chromium
+chromium --version
+```
+Then download the appropriate version of `chromedriver` from [this webpage](https://chromedriver.chromium.org/downloads), extract it, give execution permissions and put it in .venv/bin/. Here I'm downloading chromedriver for chromium 112
+```
+wget "https://chromedriver.storage.googleapis.com/112.0.5615.49/chromedriver_linux64.zip"
+unzip chromedriver_linux64.zip
+mv chromedriver .venv/bin
+chmod +x .venv/bin/chromedriver
+rm LICENSE.chromedriver chromedriver_linux64.zip
 ```
 
 `main.py` puts data in the database in the `olx_data` table.
diff --git a/requirements.txt b/requirements.txt
index 45f36d3..2992d3a 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,30 +1,4 @@
-async-generator==1.10
-attrs==21.4.0
-beautifulsoup4==4.11.1
-bs4==0.0.1
-certifi==2022.6.15
-cffi==1.15.1
-charset-normalizer==2.1.0
-cryptography==37.0.4
-h11==0.13.0
-idna==3.3
-lxml==4.9.1
-outcome==1.2.0
-pybrowsers==0.5.0
-pycparser==2.21
-pyOpenSSL==22.0.0
-PySocks==1.7.1
-python-dotenv==0.20.0
-pyxdg==0.28
-requests==2.28.1
 selenium==4.3.0
-sniffio==1.2.0
-sortedcontainers==2.4.0
-soupsieve==2.3.2.post1
-trio==0.21.0
-trio-websocket==0.9.2
-urllib3==1.26.10
-webdriver-manager==3.8.0
-wsproto==1.1.0
+bs4
 unidecode
 matplotlib
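
Note on the README change: the new installation steps rely on picking a `chromedriver` build that matches the installed chromium (112 in the example). A minimal sketch of how that match could be checked is below. It is not part of this change; the helper names `major_version` and `versions_match` are hypothetical, and the sample strings only assume that both `chromium --version` and `chromedriver --version` print a four-part version number somewhere in their output.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from text containing a four-part version number."""
    # Assumption: the tool prints something like "<name> 112.0.5615.49 ...".
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in: {version_output!r}")
    return int(match.group(1))


def versions_match(browser_output: str, driver_output: str) -> bool:
    """Return True when browser and driver report the same major version."""
    return major_version(browser_output) == major_version(driver_output)
```

In practice the two strings would come from running `chromium --version` and `.venv/bin/chromedriver --version` (for example via `subprocess.run(..., capture_output=True, text=True)`), and a mismatch would signal that a different `chromedriver` archive should be downloaded.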