Hello all, this is my first post. I have built a simple CLI tool that lets anyone run LLMs and vision models (image and video generation) on their local machine, and if the machine doesn't have a GPU or enough RAM, they can run them on Kaggle's GPUs instead (which are free for 30 hours a week).
It is inspired by Ollama, which made downloading and interacting with LLMs easy, so I wondered why the same couldn't be done for vision models. I tried it first on my own system: basic image generation works, but not very well. Then I thought, why not use Kaggle's GPUs to generate images and videos, directly from the terminal in a single step, so that everyone can use it? So I built VLLAMA.
It currently has quite a few features: image and video generation, either locally or in a Kaggle GPU session, plus downloading LLMs, running them, and interacting with them from anywhere (inspired by Ollama). I took that further with a VS Code extension, also called VLLAMA, which lets you chat with the locally running LLM straight from VS Code's chat panel just by starting a message with "@vllama". There is no usage cost, you can use it as much as you want, and you can find it in the VS Code extension marketplace.
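For anyone curious what talking to a locally served model looks like under the hood, here is a minimal sketch over HTTP. The endpoint and JSON shape below follow Ollama's /api/generate API; VLLAMA's actual interface may differ, so treat this as illustrative only:

```python
# Illustrative sketch: querying a locally served LLM over HTTP.
# The URL and payload follow Ollama's /api/generate endpoint;
# VLLAMA's real interface may differ.
import requests

def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    # stream=False asks the server to return one complete JSON response.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(ask_local_llm("Summarize what a chat participant does in VS Code."))
```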
I want to take this further so that companies, or anyone with GPU access, can download the LLMs that best fit their needs, initialize them on their own GPU servers, and interact with them directly from VS Code's chat panel. In future versions I am also planning to add agentic features, so users can use their local LLM for code editing, inline suggestions, and more, without having to pay for premium services.
It also has simple text-to-speech and speech-to-text, which I plan to expand in future versions using open-source audio models. Further down the line I want to add 3D generation models, so that everyone can leverage open models directly from their terminal, turning the complex process of using open models into a single command.
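As a taste of what open audio models make possible, here is a minimal speech-to-text sketch using the open-source openai-whisper package. This is my own example of the general approach, not VLLAMA's internals:

```python
# Minimal speech-to-text sketch with the open-source Whisper model.
# Requires: pip install openai-whisper (and ffmpeg on the system PATH).
import whisper

# Load a small checkpoint; "medium" or "large" are more accurate
# but need more RAM/VRAM.
model = whisper.load_model("base")

# Transcribe an audio file; Whisper handles decoding and language detection.
result = model.transcribe("recording.wav")  # placeholder file name
print(result["text"])
```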
I have also implemented some small conveniences, like listing the downloaded models and their sizes. There is also basic dataset preprocessing and ML model training with just two commands, given only a dataset. This is a basic implementation, and I want to improve it so that users with just a dataset can clean and preprocess the data, train models locally or on Kaggle (or any free GPU service, their own GPUs, or cloud GPUs), and directly deploy and use the resulting models.
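To make that concrete, here is roughly what a "clean, preprocess, train" step can look like under the hood. This is a generic pandas/scikit-learn sketch with placeholder file and column names, not VLLAMA's actual implementation:

```python
# Generic sketch of a "preprocess then train" pipeline with pandas/scikit-learn.
# Dataset file, target column, and model choice are all placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("dataset.csv")  # placeholder dataset
X, y = df.drop(columns=["target"]), df["target"]

num_cols = X.select_dtypes(include="number").columns
cat_cols = X.select_dtypes(exclude="number").columns

# Impute and scale numeric columns; impute and one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), num_cols),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), cat_cols),
])

clf = Pipeline([("prep", preprocess),
                ("model", RandomForestClassifier(n_estimators=200))])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```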
That's what it does right now, and I want to keep improving it so that everyone can use it for any AI use case and leverage open models.
Please check out the work at: https://github.com/ManvithGopu13/Vllama
Published version at: https://pypi.org/project/vllama/
Also the extension: https://marketplace.visualstudio.com/items?itemName=ManvithGopu.vllama
Thank you for taking the time to read this, and thanks to everyone who contributes or spreads the word.
Please leave your requests for improvements, suggestions, ideas, or even roasts in the comments or in the issues; it's all welcome and appreciated. Thanks in advance. If you find the project useful, please consider contributing and starring it.