Created June 12, 2026 16:19
Gist: davidmezzetti/b469bdd8c601dd8659c3b1d3e739152a
# pip install txtai-minimal beautifulsoup4

# pip freeze
# beautifulsoup4==4.15.0
# soupsieve==2.8.4
# txtai_minimal==9.10.0
# typing_extensions==4.15.0

# du -hs /python
# 19M /python

from txtai import Textractor

# Extract text from the page and split it into sections
textractor = Textractor(sections=True)
for x in textractor("https://github.com/neuml"):
    print("SECTION", x)

# Output:
# SECTION **NeuML · GitHub**
#
# *NeuML is the company behind txtai, one of the most popular open-source AI frameworks in the world. - NeuML*
# SECTION NeuML is the company behind txtai, one of the most popular open-source AI frameworks in the world.
#
# We are building a suite of applications to make it easy to integrate AI into production.
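
The sections come back as plain strings, so they can be post-processed without further dependencies. The sketch below separates a leading title from the rest of a section. `split_title` is a hypothetical helper, not part of txtai, and it assumes a title, when present, is the first line wrapped in `**` as in the output above. The sample strings are copied from that output so the snippet runs without txtai or network access.

```python
import re

# Assumption: a section title is the first line, wrapped in ** ... **
TITLE = re.compile(r"^\*\*(.+?)\*\*$")


def split_title(section):
    """Hypothetical helper (not part of txtai): return (title, body)."""
    first, _, rest = section.partition("\n")
    match = TITLE.match(first.strip())
    if match:
        return match.group(1), rest.strip()
    # No recognizable title line: treat the whole section as body
    return None, section.strip()


# Sample sections copied from the output above
sections = [
    "**NeuML · GitHub**\n\n*NeuML is the company behind txtai, one of the most "
    "popular open-source AI frameworks in the world. - NeuML*",
    "NeuML is the company behind txtai, one of the most popular open-source AI "
    "frameworks in the world.\n\nWe are building a suite of applications to make "
    "it easy to integrate AI into production.",
]

for section in sections:
    title, body = split_title(section)
    print("TITLE", title)
    print("BODY", body)
```

With real data, the `sections` list would be replaced by the result of the `textractor(...)` call. Other pages may format titles differently, so the pattern may need adjusting.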