If you're stupid and you know it, write a bill. 👏👏

cbsnews.com/news/greenland-ran…



100% legal


Don't worry when you see the word "scrape": it only does an HTTP GET/POST on sites that matter for my LLM.
Fetching is hardcoded to a text file, with no aggressive scraping (e.g., rapid requests, full mirror downloads).
Project Gutenberg's site even allows downloading one book at a time as .txt, also for bots.
It uses a single, polite request with a proper User-Agent.
It does not hammer the server: just one file, once, and gone from there.
It has no telemetry, no cloud, no external logging.
It only fetches public domain content.
It runs locally, and all libraries used (PyQt6, requests, BeautifulSoup) are open-source: requests and BeautifulSoup are permissively licensed, and PyQt6 is available under the GPL ; )
This is a private, offline, ethical knowledge assistant.
Meet the monster.

Only 3 of the 12 widgets need a live LLM server running.
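
For readers who want to picture the "single, polite request" idea, here is a minimal sketch. It is not the tool's actual code: it uses only the Python standard library instead of requests, and the User-Agent string and the function names `build_request` and `fetch_once` are made up for illustration.

```python
import urllib.request
from pathlib import Path

# Hypothetical User-Agent for illustration; not taken from the original tool.
USER_AGENT = "llmfeed-fetcher/0.1 (personal, single-request use)"


def build_request(url: str) -> urllib.request.Request:
    """Build a plain GET request that identifies itself with a User-Agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT}, method="GET"
    )


def fetch_once(url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Fetch one URL exactly once and save the body as UTF-8 text.

    No retries, no crawling of links: one file, once.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        text = resp.read().decode(charset, errors="replace")
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(text, encoding="utf-8")
    return dest
```

Calling `fetch_once(url, Path("/opt/llmfeed/books/book.txt"))` (a hypothetical destination) would perform exactly one GET and write one file.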

in reply to plan-A ᕦ(ò_óˇ)ᕤ

in reply to adingbatponder

in reply to plan-A ᕦ(ò_óˇ)ᕤ

@plan-A ᕦ(ò_óˇ)ᕤ

You know that Project Gutenberg offers all of their books in text form, right? For that site you would not need to convert anything. For other sites, converting HTML to text is quite simple with pandoc:

pandoc -f html -t plain input.html -o output.txt

pandoc also handles conversion between many other document formats.
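
If you want to run that same pandoc command over a whole folder from Python, something like the following could work. This is only a sketch: the helper names `pandoc_command` and `convert_html_dir` are invented here, and it assumes pandoc is installed and on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(src: Path, dst: Path) -> list[str]:
    """Return the same pandoc invocation as above, as an argument list."""
    return ["pandoc", "-f", "html", "-t", "plain", str(src), "-o", str(dst)]


def convert_html_dir(html_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every *.html file in html_dir to a .txt file in out_dir."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(html_dir.glob("*.html")):
        dst = out_dir / (src.stem + ".txt")
        # check=True raises CalledProcessError if pandoc exits with an error.
        subprocess.run(pandoc_command(src, dst), check=True)
        written.append(dst)
    return written
```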

This entry was edited (2 days ago)
in reply to Unus Nemo

@Unus Nemo Yes, I know. The HTML > .txt conversion only runs when needed, from the general part of the app. The Gutenberg tab is for their books and has a search bar, and the same goes for RFCs. If you look here you can see an HTML page converted to text, but be fast, I set it to auto-delete after 1 day for @adingbatponder: ibb.co/N6G2Nn5r
I have filled it with man pages for all kinds of code and how to use it, broad networking topics, and development. Building the whole list kept me busy for hours, so the number of man .txt pages is huge. Everything is set up so that my LLM adheres to those manuals once I fetch them with "fetch all manuals" (I made some changes), and my LLM is now 100% accurate thanks to the man pages. Today I'll test every single feature, then I'll make a Codeberg or git account.
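
For anyone curious how man pages can end up as .txt files, here is one possible way, sketched under assumptions: it calls the local `man` command rather than fetching anything from the web (the post does not say which method the tool uses), and the names `strip_overstrike` and `export_man_page` are hypothetical. The /opt/llmfeed/man location is the one mentioned elsewhere in this thread.

```python
import re
import subprocess
from pathlib import Path

FEED_DIR = Path("/opt/llmfeed/man")  # location mentioned in this thread


def strip_overstrike(text: str) -> str:
    """Drop backspace-based bold/underline sequences (char + backspace)."""
    return re.sub(r".\x08", "", text)


def export_man_page(name: str, out_dir: Path = FEED_DIR) -> Path:
    """Render one local man page and save it as plain text."""
    # Assumes the `man` command is installed and has a page for `name`.
    result = subprocess.run(
        ["man", name], capture_output=True, text=True, check=True
    )
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / f"{name}.txt"
    dest.write_text(strip_overstrike(result.stdout), encoding="utf-8")
    return dest
```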
This entry was edited (1 day ago)
in reply to plan-A ᕦ(ò_óˇ)ᕤ

@unusnemo Really neat. Limiting an AI to your validated library of info is ultra smart. I tried something in this direction by grabbing manual pages for e.g. ssh commands, writing a Rust application using ratatui called ssh-key-manager, which is within the flake in my repo. It was hard to do and just an experiment in using Claude AI for writing hard Rust stuff. But it was amazing to see how to parallelise it to make the NixOS flake build faster.
in reply to adingbatponder

@adingbatponder @Unus Nemo
I've made some modifications: I removed the stupid emoticon bar in favor of an ASCII art and special symbols widget, and I adapted the welcome screen into a better info screen. In case you can't see the pictures I posted and are asking yourself what it is good for: the bot responds with the LLM's answer in an answer console. Or you can just use your LLM directly, as the files you scrape will be in /opt/llmfeed/, a location your LLM can access.
ibb.co/nsgCVk59
ibb.co/Fb3rZVnw
ibb.co/DfbccdSN

Edit: Your local LLM (running via llama.cpp at 127.0.0.1:8080) only sees what your Python script (llm_bot.py) sends it in the prompt field of the JSON request.

You select a .txt file in the “❓ Ask LLM” tab (e.g., /opt/llmfeed/man/git.txt).
Your llm_bot.py script then reads that file from disk.
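
The flow described here (read a .txt file, put it in the prompt field, send it to the local server) might look roughly like this. It is a sketch, not the real llm_bot.py, which is not shown in this thread. It assumes the llama.cpp server's `/completion` endpoint, which accepts `prompt` and `n_predict` and returns the answer under `content`; the prompt wording, the `max_chars` limit, and the names `build_payload` and `ask_llm` are illustrative guesses.

```python
import json
import urllib.request
from pathlib import Path

# Assumed endpoint of a llama.cpp server running locally.
LLAMA_URL = "http://127.0.0.1:8080/completion"


def build_payload(doc_path: Path, question: str, max_chars: int = 8000) -> dict:
    """Read the selected .txt file and wrap it, plus the question, in a prompt."""
    # Truncate so the text fits the model's context; 8000 is an arbitrary guess.
    text = doc_path.read_text(encoding="utf-8", errors="replace")[:max_chars]
    prompt = (
        "Answer using only the reference text below.\n\n"
        f"--- reference ---\n{text}\n--- end reference ---\n\n"
        f"Question: {question}\nAnswer:"
    )
    return {"prompt": prompt, "n_predict": 256}


def ask_llm(doc_path: Path, question: str) -> str:
    """POST the payload to the local server and return the generated text."""
    data = json.dumps(build_payload(doc_path, question)).encode("utf-8")
    req = urllib.request.Request(
        LLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))["content"]
```

Nothing outside the prompt reaches the model, which is why the choice of file matters: `ask_llm(Path("/opt/llmfeed/man/git.txt"), "How do I undo the last commit?")` would only ever show it the git manual text.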

Also avoid scraping from sites such as the Cisco docs, CompTIA Network+ and the like: you will get a 403 and possibly get your IP banned.
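
One way to respect that in code is to treat a refusal as final instead of retrying. The sketch below is an assumption about how that could be done, not part of the tool: the status set, the User-Agent default, and the names `is_refusal` and `fetch_or_skip` are made up for illustration.

```python
import urllib.error
import urllib.request
from typing import Optional

# Statuses treated as "the site does not want this request" (an assumption).
STOP_STATUSES = {401, 403, 429}


def is_refusal(status: int) -> bool:
    """True if the server's answer means we should stop asking."""
    return status in STOP_STATUSES


def fetch_or_skip(url: str, user_agent: str = "llmfeed-fetcher/0.1") -> Optional[bytes]:
    """Fetch once; on a refusal return None instead of retrying."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if is_refusal(err.code):
            return None  # do not retry; skip this site
        raise
```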

This entry was edited (1 day ago)
in reply to plan-A ᕦ(ò_óˇ)ᕤ

This post looks very useful. I made a similar tool (still in beta, with errors) to grab screenshots and raw data of manual pages #manpages for #linux commands, in this case #ssh info, and to present the different layers: the OS, app and NixOS settings. The program ssh-key-manager is in my flake repoducible.org (features/security/packages #flake #part can be pulled out into another flake) and is a bit of a monster. Yours looks ultra compact #nixos
#flakepart #flakeparts
This entry was edited (1 day ago)
in reply to adingbatponder

@adingbatponder Thanks! Yours looks cool, handy and well structured. The grab from HTTP to .txt (it must be text for my LLM; I don't know if other models work the same) was a simple command AFTER coding the script, and it helped me keep it compact, with no telemetry since it is local. The idea came from @Unus Nemo:
a possibility to train models by downloading the knowledge, and he added the bot approach as another option. He also pointed me earlier to self-hosting an LLM; he showed the door. So part of the credit goes to him.
in reply to adingbatponder

@adingbatponder

Most apps are geared toward Mastodon, not Friendica, so in my experience you will have issues with them when accessing Friendica instances.

@plan-A ᕦ(ò_óˇ)ᕤ

Put your code on GitHub so that your friend can download it from there, play nice 😉.

in reply to Unus Nemo

@Unus Nemo @adingbatponder First day of testing each widget in detail, then I'll remake a GitHub page. I'm new to dev; I used GitHub long ago for other purposes, and my coding was limited to tool automation: Scilla (long ago), Bash, Python and tool commands using cheat sheets, mostly network commands...
The thing is, it's a rinse-and-repeat process of trial and failure, and modifying it that much teaches me a lot (with the LLM that helped me).
But I will certainly do it, as I have a lot of projects sitting here. I'll open-source them, that's the spirit! ; )
This entry was edited (1 day ago)
in reply to plan-A ᕦ(ò_óˇ)ᕤ

@unusnemo A public repo is a big step and worth doing once it works nicely, but in my experience this can take many weeks! I think the idea has been shown and is great.




Well, I guess it's time I give up on Tracking Token Disrespector for disrespecting Flipboard, which is a legit news aggregator.

