For Developer Only

conda create -n wiseflow python=3.10
conda activate wiseflow
cd core
pip install -r requirements.txt
  • tasks.py: background task loop process
  • backend.py: main pipeline service process (based on FastAPI)

WiseFlow FastAPI details

{'user_id': str, 'type': str, 'content': str, 'addition': Optional[str]}
# type is one of "text", "publicMsg", "site" and "url"
# user_id: str
type: Literal["text", "publicMsg", "file", "image", "video", "location", "chathistory", "site", "attachment", "url"]
content: str
addition: Optional[str] = None

For more details, open http://127.0.0.1:7777/docs once the backend has started.
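
The request body described above can be assembled and checked on the client side before it is sent. The following is a minimal sketch using only the standard library; `build_payload` and the sample values are hypothetical names chosen for illustration, and the endpoint path is deliberately left out because it is not stated here (check the /docs page for it).

```python
from typing import Literal, Optional, get_args

# Allowed values copied from the `type` field documented above.
FeedType = Literal["text", "publicMsg", "file", "image", "video",
                   "location", "chathistory", "site", "attachment", "url"]


def build_payload(user_id: str, type_: str, content: str,
                  addition: Optional[str] = None) -> dict:
    """Hypothetical helper: build the request body and reject unknown types early."""
    if type_ not in get_args(FeedType):
        raise ValueError(f"unsupported type: {type_!r}")
    return {"user_id": user_id, "type": type_,
            "content": content, "addition": addition}


# Example body for submitting a single article URL (values are placeholders).
payload = build_payload("demo_user", "url", "https://example.com/article")
```

The resulting dict can be serialized with `json.dumps` and posted with any HTTP client. Validating `type` locally only mirrors the documented field list; the backend remains the authority on which values it actually processes.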

WiseFlow Repo File Structure

wiseflow
|- dockerfiles
|- ...
|- core
    |- tasks.py
    |- backend.py
    |- insights
        |- __init__.py  # main process
        |- get_info.py  # module that uses an LLM to summarize information and match tags
    |- llms  # LLM service wrapper
    |- pb  # PocketBase folder
    |- scrapers
        |- __init__.py  # you can register a proprietary site scraper here
        |- general_scraper.py  # module to get all possible article URLs for a general site
        |- general_crawler.py  # module for general article sites
        |- mp_crawler.py  # module for mp article (Weixin public account) sites
    |- utils  # tools

Although the general_scraper included in wiseflow can parse most static pages, for actual business use we still recommend that customers write their own crawlers targeting their actual information sources.

See core/scrapers/README.md for instructions on integrating proprietary crawlers.
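
To illustrate the idea of registering a site-specific crawler alongside a general fallback, here is a minimal sketch. It is not the actual wiseflow interface: `scraper_map`, `register`, `pick_scraper`, the scraper signature, and the example domain are all assumptions made for this example, so follow core/scrapers/README.md for the real conventions.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# Assumed signature: a scraper takes a URL and returns a dict of parsed fields.
Scraper = Callable[[str], dict]

# Hypothetical registry mapping a site's domain to its dedicated scraper.
scraper_map: Dict[str, Scraper] = {}


def register(domain: str, scraper: Scraper) -> None:
    """Associate a domain with a proprietary scraper."""
    scraper_map[domain] = scraper


def fallback_scraper(url: str) -> dict:
    """Stand-in for a general-purpose scraper used when no domain matches."""
    return {"url": url, "source": "general"}


def example_news_scraper(url: str) -> dict:
    """Stand-in for a site-specific scraper; real parsing logic would go here."""
    return {"url": url, "source": "news.example.com"}


def pick_scraper(url: str) -> Scraper:
    """Return the scraper registered for the URL's domain, else the fallback."""
    return scraper_map.get(urlparse(url).netloc, fallback_scraper)


register("news.example.com", example_news_scraper)
```

Dispatching on the domain keeps site-specific parsing isolated, so adding a new information source under this scheme only requires one new function and one `register` call.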