That grand, spacious (and long-ignored) branch of computing that I never really found a reason to graduate in (a big mistake I only realized 30 years later) and that, these days, deals not just with thinking machines (so-called “hard” AI), but mostly with enhancing the usefulness of machines in general by exploring mechanisms to express and manipulate harvested knowledge.
Or just pretty pictures and weird chatbots. Either way, it seems we’re currently between AI winters, with a lot of investment in GPU hardware that still needs to prove itself consistently, measurably useful in real-life use cases.
Resources
| Field | Category | Date | Link | Notes |
| --- | --- | --- | --- | --- |
| | Tools | 2024 | amd_inference | a tool that enables inference on AMD GPUs |
| General | Examples | | flash-attention-minimal | a minimal implementation of Flash Attention |
| | Frameworks | | tinygrad | An 8000 LOC deep learning framework |
| | | 2023 | marvin | A generic wrapper for various AI APIs |
| | | | MLX | An array framework for Apple silicon |
| | | | mlx-examples | MLX examples |
| | Jupyter | | jupyter-ai | an official Jupyter plugin that can handle multiple AI back-ends (although it seems less flexible than the others right now) |
| | Libraries | | unstructured | a library for handling and segmenting unstructured data of various kinds, from text to common file formats |
| | | | ml-ane-transformers | Apple’s transformers library, optimized for the Neural Engine |
| | | 2009 | Alchemy | A toolkit providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on Markov logic representation |
| | Networking | 2024 | scuda | a GPU-over-IP bridge that allows remote GPUs to be used by CPU-only machines |
| | Tools | | scrape-it-now | A flexible scraper implemented on Azure |
| | | | treescope | An interactive tensor visualizer for IPython notebooks |
| | | | pico-tflmicro | a port of TensorFlow Lite Micro to the Raspberry Pi Pico |
| | | | upscayl | an AI-based image upscaler |
| | | 2023 | basaran | An open-source alternative to the OpenAI text completion API, with a compatible streaming API for privately hosted models |
| | | | explainerdashboard | a web app that explains the workings of a (scikit-learn compatible) machine learning model |
| | | | NVIDIA Triton Inference Server | A high-performance inference server |
| | | 2022 | Project Bumblebee | Pre-trained transformer neural models in Elixir |
| | Vector Databases | 2024 | txtai | an embeddings database for semantic search |
| | | | tinkerbird | a vector database atop IndexedDB |
| Generative Audio | Models | 2023 | bark | a text-prompted generative audio model |
| Large Language Models | Agents | 2024 | crewAI | a framework for orchestrating autonomous AI agents |
| | | | experts | yet another agent tool, this time using the OpenAI assistant interface |
| | | | SWE-agent | a prototype GitHub issue bot |
| | | 2023 | SuperAGI | another AutoGPT-like harness for building GPT agents |
| | | | microagents | an interesting experiment on self-editing agents |
| | | | llm_agents | a simplified agent framework (doesn’t use OpenAI functions) |
| | Algorithms | 2024 | word-embedding | an implementation of word2vec skip-gram for word embedding |
| | Applications | 2023 | Chie | A cross-platform desktop application with chat history and extension support |
| | Copilots | | Obsidian Copilot | an interesting take on how to use semantic search and OpenSearch’s BM25 implementation |
| | Demos | 2024 | WhisperFusion | an ensemble setup with WhisperSpeech, WhisperLive and Phi |
| | Frameworks | 2023 | litellm | a simple, lightweight LLM wrapper |
| | | | AutoChain | Yet another alternative to langchain |
| | | | Tanuki | yet another LLM framework using decorators for data validation |
| | | | griptape | a modular framework for building LLM applications |
| | | | guidance | a library for controlling language models more effectively and efficiently than traditional prompting or chaining |
| | | | langchain | a composable approach to building LLM applications |
| | | | llama_index | a data framework for LLM applications |
| | | | txtai | has spinoffs for chat, workflows for medical/scientific papers, semantic search for developers and semantic search for headlines and story text |
| | | | llmflows | Yet another alternative to langchain, but with an interesting approach to defining workflows |
| | | | fabric | a componentized approach to building LLM pipelines |
| | Front-Ends | 2024 | jan | an open-source ChatGPT alternative that runs 100% offline (uses nitro) |
| | | 2023 | SecureAI-Tools | a self-hosted local inference front-end for chatting with document collections |
| | | | gpt4all | another self-hosted local inference front-end |
| | Jupyter | | LLMBook | A VS Code notebook interface for LLMs |
| | | | jupytee | a Jupyter plugin that can handle code generation and image generation, but not switching models (GPT-4) |
| | | | genai | a Jupyter plugin that can handle code generation and fixes based on tracebacks |
| | | | ipython-gpt | a Jupyter plugin that can handle multiple models |
| | Libraries | 2024 | chonkie | a lightweight library for efficient text chunking in RAG applications |
| | | | databonsai | a Python library that uses LLMs to perform data cleaning |
| | | | DataDreamer | a library for prompting, synthetic data generation, and training workflows |
| | | | radients | a vectorization library that can handle more than just text |
| | | 2023 | guardrails | a package for validating and correcting the outputs of large language models |
| | | | MemGPT | a memory management/summarization technique for unbounded context |
| | | | instructor | a clever library that simplifies invoking OpenAI function calls |
| | | | simpleaichat | A simple wrapper for the ChatGPT API |
| | Models | 2024 | functionary | a model that can interpret and execute functions/plugins |
| | | | TinyLlama | pretraining of a 1.1B Llama model on 3 trillion tokens |
| | | | Octopus-v2 | a model designed for both function calling and on-device inference |
| | | 2023 | ml-ferret | a multi-modal model from Apple |
| | | | turbopilot | a GitHub Copilot replacement that can run locally (CPU only) |
| | Reference | 2024 | sqlite-hybrid-search | an example of how to do hybrid (vector and FTS) search with SQLite for RAG |
| | | 2023 | Native JSON Output from GPT-4 | tips on how to use OpenAI JSON and function calling |
| | | | Using LLaMA with M1 Mac | Manual instructions for Apple Silicon |
| | | | Prompt Engineering Guide | a set of lecture notes and detailed examples of prompting techniques |
| | | | awesome-decentralized-llm | a collection of LLM resources that operate independently |
| | | | GPT Prompt Archive | A set of sample base prompts for various LLMs |
| | | | promptbase | Another set of prompting techniques and detailed examples |
| | | 2022 | awesome-chatgpt-prompts | might be a short-lived resource, but an interesting one |
| | Samples | 2024 | SimpleTinyLlama | a simple PyTorch-based implementation of TinyLlama |
| | | | devlooper | a program synthesis agent that autonomously fixes its output by running tests |
| | | 2023 | gpt-researcher | a simple agent that does online research on any given topic |
| | | | David Attenborough narrates your life | A pretty hilarious image-to-description example |
| | | | LibreChat | A self-hosted ChatGPT alternative |
| | | | sharepoint-indexing-azure-cognitive-search | an example of how to use Graph navigation and Cognitive Search indexing |
| | | | gpt4all | open-source LLM chatbots |
| | | | Demystifying Advanced RAG Pipelines | An LLM-powered advanced RAG pipeline built from scratch |
| | | | Wanderlust OpenAI example using Solara | A simple interactive web shell with some nice features |
| | | | GPT in 60 Lines of NumPy | a tutorial on how to build a GPT model from scratch |
| | | | Bash One-Liners for LLMs | a collection of one-liners for various LLMs |
| | Tools | 2024 | ollama-bot | a rudimentary IRC bot that communicates with a local instance of ollama |
| | | | burr | a tool for creating and managing LLM workflows |
| | | | geppetto | a bot for integrating ChatGPT and DALL-E into Slack |
| | | | Perplexica | a Perplexity AI search engine clone |
| | | | koboldcpp | an easy-to-use AI text-generation software for GGML and GGUF models, based on llama.cpp |
| | | | GPTFast | a set of inference acceleration techniques |
| | | | pico-cookbook | Recipes for on-device voice AI and local LLMs |
| | | | gpt-pilot | a prototype development tool that leverages GPT |
| | | | R2R | a framework for rapid development and deployment of production-ready RAG systems, with SQLite support |
| | | | notesollama | a plugin for Apple Notes that uses the Accessibility APIs |
| | | | gguf-tools | a set of tools for manipulating GGUF files |
| | | | local-image-gen | A GPTScript tool to generate images |
| | | | WordLlama | a lightweight NLP toolkit for tasks like fuzzy deduplication, similarity, and ranking |
| | | | exo | an intriguing P2P clustering solution for running models across several machines |
| | | | oterm | a terminal-based interface for LLMs |
| | | | GPTScript | natural language programming against multiple LLMs |
| | | | llm-ls | a local language server that leverages LLMs |
| | | | llm-vscode | a VS Code extension that uses llm-ls |
| | | | ipex-llm | a PyTorch extension for running LLMs on Intel hardware |
| | | | nitro | a self-hosted inference engine for edge computing with an OpenAI-compatible API |
| | | | emacs-copilot | an Emacs extension for using a local LLM |
| | | | llm.c | LLM training in simple, raw C/CUDA |
| | | | dify | an open-source LLM app development platform with a node-based UX |
| | | | llmware | a framework for developing LLM-based applications, including Retrieval-Augmented Generation |
| | | | LLMLingua | a tool for compressing prompts with minimal loss of information |
| | | | TinyTroupe | a multi-agent persona simulation |
| | | | genaiscript | a JavaScript environment for prompt development and structured data extraction for LLMs |
| | | | GraphRAG | a data pipeline that builds knowledge graphs and performs RAG over them |
| | | | lida | a tool for automatic generation of visualizations and infographics |
| | | | hqq | an implementation of Half-Quadratic Quantization (HQQ) |
| | | | LLocalSearch | a local search tool that uses LLMs |
| | | | nlm-ingestor | a set of parsers for common file formats |
| | | | open-webui | a web-based interface for LLMs |
| | | | pipecat | yet another LLM agent framework |
| | | | plandex | yet another long-running agent tool for complex coding tasks |
| | | | korvus | a search SDK that unifies the entire RAG pipeline in a single database query |
| | | | lorax | a framework that allows users to serve thousands of fine-tuned models on a single GPU |
| | | | mark | a CLI for interacting with LLMs using Markdown and images |
| | | | reor | a note-taking tool that performs RAG using a local LLM |
| | | | privy | An open-source alternative to GitHub Copilot that runs locally |
| | | | storm | a tool that researches a topic and generates a full-length report with citations |
| | | | fably | A device that tells bedtime stories to kids, using chunked TTS |
| | | | NeuralFlow | a Python script for plotting the intermediate layer outputs of Mistral 7B |
| | | 2023 | pykoi | a unified interface for data and feedback collection, including model comparisons |
| | | | Auto-GPT | an attempt to provide ChatGPT with a degree of autonomy |
| | | | a1gpt | A C++ implementation of a GPT-2 inference engine |
| | | | BricksLLM | an OpenAI gateway in Go to create API keys with rate limits, cost limits and TTLs |
| | | | dalai | An automated installer for LLaMA |
| | | | localpilot | a MITM proxy that lets you use the GitHub Copilot extension with other LLMs |
| | | | macOSpilot-ai-assistant | An Electron app for macOS |
| | | | embedchain | another framework to create bots from existing datasets |
| | | | llama.cpp | A C++ port of Facebook’s LLaMA model. Still requires roughly 240GB of (unoptimized) weights, but can run on a 64GB Mac. |
| | | | PromptTools | self-hostable tools for evaluating LLMs, vector databases, and prompts |
| | | | ChainForge | a visual programming environment for benchmarking prompts across multiple LLMs |
| | | | khoj | an intriguing personal assistant based on local data |
| | | | minillm | A GPU-focused Python wrapper for LLaMA |
| | | | langflow | a node-based GUI for quick iteration of langchain flows |
| | | | simple-llama-finetuner | A way to do LoRA adaptation of LLaMA |
| | | | chatbot-ui | a more or less sensibly designed self-hosted ChatGPT UI |
| | | | TinyChatEngine | A local (edge) inference engine in C++ without any dependencies |
| | | | content-chatbot | A way to quickly create custom embeddings from a website |
| | | | LocalAI | A local, drop-in replacement for the OpenAI API |
| | | | chatblade | a CLI wrapper for ChatGPT |
| | | | promptfoo | A tool for testing and evaluating LLM prompt quality |
| | | | GPTQ-for-LLaMa | a way to quantize the LLaMA weights to 4-bit precision |
| | | | Serve | A containerized solution for using local LLMs via web chat |
| | | | llama-rs | A Rust port of llama.cpp |
| | | | alpaca-lora | Another way to do LoRA adaptation of LLaMA |
| | | | wyGPT | another C++ local inference tool |
| | Vector Databases | | chroma | an embedding database |
| | | | vectordb | A simple vector database that can run in-process |
| | | | marqo | A vector database that performs vector generation internally |
| | | | USearch | A single-file vector search engine |
| | Workflows | | danswer | a pretty complete GPT/search integration solution with GitHub, Slack and Confluence/JIRA connectors |
| Multi-modal Models | Samples | 2024 | ml-mgie | instruction-based image editing |
| Multimodal Models | Libraries | | zerox | a library that performs OCR on documents and converts them to Markdown |
| | Models | | Hybrid-Net | Real-time audio to chords, lyrics, beat, and melody |
| | Tools | | swift-ocr-llm-powered-pdf-to-markdown | a tool that processes PDF files into structured Markdown |
| NeRFs | | 2022 | nerfstudio | A tool for manipulating Neural Radiance Fields (NeRF) and rendering the scenes out as video |
| Speech Recognition | Models | 2024 | WhisperLive | a real-time speech-to-text system based on Whisper |
| | | | moonshine | a family of models, and a supporting library, optimized for fast and accurate automatic speech recognition on resource-constrained devices |
| | | 2023 | distil-whisper | a distilled version of Whisper that is 6 times faster |
| | | 2022 | whisper.cpp | a C++ implementation of Whisper that can run on consumer hardware |
| | | | whisper | a general-purpose speech recognition model |
| | Tools | 2024 | audapolis | an editor for spoken-word audio with automatic transcription |
| | | 2023 | insanely-fast-whisper | An opinionated CLI for audio transcription |
| Speech Synthesis | Libraries | 2024 | MeloTTS | a multi-lingual text-to-speech library with real-time inference |
| | Models | | ChatTTS | a text-to-speech model designed specifically for dialogue scenarios, with decent prosody |
| | | | Real-Time-Voice-Cloning | a PyTorch implementation of a voice cloning model |
| | | | WhisperSpeech | a text-to-speech system built by inverting Whisper |
| | | 2023 | StyleTTS2 | A text-to-speech model that supports style diffusion |
| | Tools | 2024 | OpenVoice | a tool that enables accurate voice cloning with multi-lingual support and flexible style control |
| | | | Voice Cloning | a minimal sampling approach |
| Stable Diffusion | Apps | 2023 | swift-coreml-diffusers | Hugging Face’s own app, using Swift and CoreML for Apple Silicon |
| | | 2022 | Draw Things | Pre-packaged app for iOS; downloads and allows re-use of .ckpt files |
| | | | DiffusionBee | Pre-packaged app for macOS (M1 and Intel) |
| | CGI | 2023 | Blender-ControlNet | A Blender plugin to generate ControlNet inputs for posing figures |
| | | 2022 | dream-textures | A Blender plugin for texturing models based on a text description |
| | Implementations | 2023 | OnnxStream | Stable Diffusion XL 1.0 Base on a Raspberry Pi Zero 2 (or in 298MB of RAM) |
| | Libraries | 2024 | sd4j | a Java library for Stable Diffusion that uses ONNX |
| | Models | | SDXL-Lightning | an SDXL flavor that generates images in only a few steps |
| | | 2023 | Upscale Model Database | Too wide a choice, perhaps |
| | | 2022 | Fast Stable Diffusion | Another tactic to accelerate inference |
| | | | CoreML Stable Diffusion | Apple’s optimizations for CoreML |
| | Reference | 2024 | comflowy | a set of reference workflows and documentation for ComfyUI |
| | | | flux | minimal inference examples for FLUX.1 models |
| | Tools | | comflowyspace | a ComfyUI desktop wrapper |
| | | 2023 | ComfyUI-AnimateDiff-Evolved | An AnimateDiff integration for ComfyUI |
| | | | ComfyUI | a pretty impressive node-based UI |
| | | | InvokeAI | A polished UI |
| | | | stable-diffusion.cpp | Stable Diffusion inference on the CPU, in pure C++ |
| | | | ComfyUI-Manager | A component manager for ComfyUI |
| | | | Opendream | A layer-oriented, non-destructive editor |
| | | 2022 | Stable Diffusion WebUI | Nearly always the best, bleeding-edge WebUI for SD |
| | | | imaginAIry | Works well on Apple Silicon; a pure CLI interface to all SD models. Does not reuse .ckpt files, however, so it requires a separate disk cache |
| Vision | | 2024 | machina | a CCTV viewer that connects to RTSP streams and performs real-time object tagging using YOLO and ollama |