Artificial Intelligence

That grand, spacious (and long ignored) branch of computing that I never really found a reason to graduate in (big mistake I only realized 30 years later) and that, these days, deals not just with thinking machines (so-called “hard” AI), but mostly with enhancing the usefulness of machines in general by exploring mechanisms to express and manipulate harvested knowledge.

Or just pretty pictures and weird chatbots. Either way, it seems we’re currently between AI winters, with a lot of investment in GPU hardware that still needs to prove itself consistently, measurably useful in real-life use cases.

Resources

Field Category Date Link Notes
Tools 2024 amd_inference

a tool that enables inference on AMD GPUs

General Examples flash-attention-minimal

a minimal implementation of Flash Attention

Frameworks tinygrad

An 8000 LOC deep learning framework

2023 marvin

A generic wrapper for various AI APIs

MLX

An array framework for Apple silicon

mlx-examples

MLX examples

Jupyter jupyter-ai

an official Jupyter plugin that can handle multiple AI back-ends (although it seems less flexible than the others right now)

Libraries unstructured

a library for handling and segmenting unstructured data of various kinds, from text to common file formats

ml-ane-transformers

Apple’s transformers library, optimized for the Neural Engine

2009 Alchemy

A toolkit providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on Markov logic representation

Networking 2024 scuda

a GPU over IP bridge that allows remote GPUs to be utilized by CPU-only machines

Tools scrape-it-now

A flexible scraper implemented in Azure

treescope

An interactive tensor visualizer for IPython notebooks.

pico-tflmicro

a port of TensorFlow Lite Micro to the Raspberry Pi Pico

upscayl

an AI-based image upscaler

2023 basaran

An open-source alternative to the OpenAI text completion API, with a compatible streaming API for privately hosted models.

explainerdashboard

a web app that explains the workings of a (scikit-learn compatible) machine learning model

NVIDIA Triton Inference Server

A high-performance inference server

2022 Project Bumblebee

Pre-trained neural network models, including transformers, in Elixir.

Vector Databases 2024 txtai

embeddings database for semantic search

tinkerbird

Vector Database atop IndexedDB

Generative Audio models 2023 bark

a text-prompted generative audio model

Large Language Models Agents 2024 crewAI

a framework for orchestrating autonomous AI agents

experts

yet another agent tool, this time using the OpenAI assistant interface

SWE-agent

a prototype GitHub issue bot

2023 SuperAGI

another AutoGPT-like harness for building GPT agents

microagents

an interesting experiment on self-editing agents

llm_agents

a simplified agent framework (doesn’t use OpenAI functions)

Algorithms 2024 word-embedding

an implementation of word2vec skip-gram for word embedding.

Applications 2023 Chie

A cross-platform desktop application with chat history and extension support

Copilots Obsidian Copilot

an interesting take on how to use semantic search and OpenSearch’s BM25 implementation

Demos 2024 WhisperFusion

an ensemble setup with WhisperSpeech, WhisperLive and Phi

Frameworks 2023 litellm

a simple, lightweight LLM wrapper

AutoChain

Yet another alternative to langchain

Tanuki

yet another LLM framework using decorators for data validation

griptape

a langchain alternative with slightly better internal coding standards

guidance

Control modern language models more effectively and efficiently than traditional prompting or chaining.

langchain

a composable approach for building LLM applications

llama_index

a data framework for LLM applications

txtai

has spinoffs for chat, workflows for medical/scientific papers, semantic search for developers and semantic search for headlines and story text

llmflows

Yet another alternative to langchain, but with an interesting approach at defining workflows

fabric

a componentized approach to building LLM pipelines

Front-Ends 2024 jan

an open-source ChatGPT alternative that runs 100% offline (uses nitro)

2023 SecureAI-Tools

a self-hosted local inference front-end for chatting with document collections

gpt4all

another self-hosted local inference front-end

Jupyter LLMBook

A VS Code notebook interface for LLMs

jupytee

a Jupyter plugin that can handle code generation and image generation, but not switching models (GPT-4)

genai

a Jupyter plugin that can handle code generation and fixes based on tracebacks

ipython-gpt

a Jupyter plugin that can handle multiple models

Libraries 2024 chonkie

a lightweight library for efficient text chunking in RAG applications.

databonsai

a Python library that uses LLMs to perform data cleaning

DataDreamer

library for prompting, synthetic data generation, and training workflows

radients

a vectorization library that can handle more than just text

magentic

decorators to create functions that return structured output from an LLM.

2023 guardrails

a package for validating and correcting the outputs of large language models

MemGPT

a memory management/summarization technique for unbounded context

instructor

a clever library that simplifies invoking OpenAI function calls

simpleaichat

A simple wrapper for the ChatGPT API

Models 2024 functionary

can interpret and execute functions/plugins

TinyLlama

pretraining of a 1.1B Llama model on 3 trillion tokens.

Octopus-v2

a model designed for both function calling and on-device inference

2023 ml-ferret

a multi-modal model from Apple

turbopilot

a GitHub Copilot replacement that can run locally (CPU only)

Reference 2024 sqlite-hybrid-search

an example of how to do hybrid (vector and FTS) search with SQLite for RAG

2023 Native JSON Output from GPT-4

tips on how to use OpenAI JSON and function calling

Using LLaMA with M1 Mac

Manual instructions for Apple Silicon

Prompt Engineering Guide

a set of lecture notes and detailed examples of prompting techniques

awesome-decentralized-llm

a collection of LLM resources that operate independently

GPT Prompt Archive

A set of sample base prompts for various LLMs

promptbase

Another set of prompting techniques and detailed examples

2022 awesome-chatgpt-prompts

might be a short-lived resource, but an interesting one

Samples 2024 SimpleTinyLlama

a simple PyTorch-based implementation

devlooper

a program synthesis agent that autonomously fixes its output by running tests

2023 gpt-researcher

a simple agent that does online research on any given topic

David Attenborough narrates your life

A pretty hilarious image-to-description example

LibreChat

A self-hosted ChatGPT alternative

sharepoint-indexing-azure-cognitive-search

provides an example of how to use Graph navigation and Cognitive Search indexing

gpt4all

open-source LLM chatbots

Demystifying Advanced RAG Pipelines

An LLM-powered advanced RAG pipeline built from scratch

Wanderlust OpenAI example using Solara

A simple interactive web shell with some nice features

GPT in 60 Lines of NumPy

a tutorial on how to build a GPT model from scratch

Bash One-Liners for LLMs

a collection of one-liners for various LLMs

Tools 2024 ollama-bot

a rudimentary IRC bot that communicates with a local instance of ollama

burr

a tool for creating and managing LLM workflows

geppetto

a bot for integrating ChatGPT and DALL-E into Slack

Perplexica

a Perplexity AI search engine clone

koboldcpp

an easy-to-use AI text-generation software for GGML and GGUF models based on llama.cpp

GPTFast

a set of acceleration techniques

pico-cookbook

Recipes for on-device voice AI and local LLM

gpt-pilot

a prototype development tool that leverages GPT

R2R

a framework for rapid development and deployment of production-ready RAG systems with SQLite support

notesollama

a plugin for Apple Notes that uses the Accessibility APIs

gguf-tools

a set of tools for manipulating GGUF format files

local-image-gen

A GPTScript tool to generate images

WordLlama

a lightweight NLP toolkit for tasks like fuzzy-deduplication, similarity, and ranking

exo

an intriguing P2P clustering solution for running models across several machines

oterm

a terminal-based interface for LLMs

GPTScript

Natural Language Programming against multiple LLMs

llm-ls

a local language server that leverages LLMs

llm-vscode

a VS Code extension that uses llm-ls

ipex-llm

a PyTorch extension for Intel hardware

nitro

a self-hosted inference engine for edge computing with an OpenAI-compatible API

emacs-copilot

an Emacs extension for using a local LLM

llm.c

LLM training in simple, raw C/CUDA

dify

an open-source LLM app development platform with a node-based UX

llmware

a framework for developing LLM-based applications including Retrieval Augmented Generation

LLMLingua

a tool for compressing prompts with minimal loss of information

TinyTroupe

a multiagent persona simulation

genaiscript

a JavaScript environment for prompt development and structured data extraction for LLMs.

GraphRAG

a data pipeline designed to pre-process knowledge graphs and perform RAG on them

lida

automatic generation of visualizations and infographics

hqq

an implementation of Half-Quadratic Quantization (HQQ)

LLocalSearch

a local tool for searching using LLMs

nlm-ingestor

a set of parsers for common file formats

open-webui

a web-based interface for LLMs

pipecat

yet another LLM agent framework

plandex

yet another long-running agent tool for complex coding tasks

korvus

a search SDK that unifies the entire RAG pipeline in a single database query

lorax

a framework that allows users to serve thousands of fine-tuned models on a single GPU

mark

CLI to interact with LLMs using markdown and images

reor

a note taking tool that performs RAG using a local LLM

privy

An open-source alternative to GitHub Copilot that runs locally.

storm

a tool that researches a topic and generates a full-length report with citations

fably

A device that tells bedtime stories to kids, using chunked TTS

NeuralFlow

a Python script for plotting the intermediate layer outputs of Mistral 7B

2023 pykoi

a unified interface for data and feedback collection, including model comparisons

Auto-GPT

an attempt to provide ChatGPT with a degree of autonomy

a1gpt

A C++ implementation of a GPT-2 inference engine

BricksLLM

an OpenAI gateway in Go to create API keys with rate limits, cost limits and TTLs

dalai

An automated installer for LLaMA

localpilot

a MITM proxy that lets you use the GitHub Copilot extension with other LLMs

macOSpilot-ai-assistant

An Electron app for macOS

embedchain

another framework to create bots from existing datasets

llama.cpp

A C++ port of Facebook’s LLaMA model. Still requires roughly 240GB of (unoptimized) weights, but can run on a 64GB Mac.

PromptTools

self-hostable tools for evaluating LLMs, vector databases, and prompts

ChainForge

a visual programming environment for benchmarking prompts across multiple LLMs

khoj

an intriguing personal assistant based on local data

minillm

A GPU-focused Python wrapper for LLaMA

langflow

a node-based GUI for quick iteration of langchain flows

simple-llama-finetuner

A way to do LoRA adaptation of LLaMA

chatbot-ui

a more or less sensibly designed self-hosted ChatGPT UI

TinyChatEngine

A local (edge) inference engine in C++ without any dependencies

content-chatbot

A way to quickly create custom embeddings off a web site

LocalAI

A local, drop-in replacement for the OpenAI API

chatblade

a CLI wrapper for ChatGPT

promptfoo

A tool for testing and evaluating LLM prompt quality.

GPTQ-for-LLaMa

a way to quantize the LLaMA weights to 4-bit precision

Serve

A containerized solution for using local LLMs via web chat

llama-rs

A Rust port of llama.cpp

alpaca-lora

Another way to do LoRA adaptation of LLaMA

wyGPT

another C++ local inference tool

Vector Databases chroma

an embedding database

vectordb

A simple vector database that can run in-process

marqo

A vector database that performs vector generation internally

USearch

A Single-File Vector Search Engine

Workflows danswer

a pretty complete GPT/search integration solution with GitHub, Slack and Confluence/JIRA connectors

Multi-modal Models Samples 2024 ml-mgie

instruction-based image self-editing

Multimodal Models Libraries zerox

a library that performs OCR on documents and converts them to Markdown

Models Hybrid-Net

Real-time audio to chords, lyrics, beat, and melody.

Tools swift-ocr-llm-powered-pdf-to-markdown

a tool that processes PDF files into structured Markdown format

NeRFs 2022 nerfstudio

A tool for manipulating Neural Radiance Fields (NeRF) and rendering the scenes out as video

Speech Recognition Models 2024 WhisperLive

a real-time speech-to-text (transcription) system based on Whisper

moonshine

a family of models optimized for resource-constrained devices.

a library optimized for fast and accurate automatic speech recognition on resource-constrained devices.

2023 distil-whisper

a distilled version of whisper that is 6 times faster

2022 whisper.cpp

a C++ implementation of whisper that can run on consumer hardware

whisper

a general purpose speech recognition model

Tools 2024 audapolis

an editor for spoken-word audio with automatic transcription

2023 insanely-fast-whisper

An opinionated CLI for audio transcription

Speech Synthesis Libraries 2024 MeloTTS

a multi-lingual text-to-speech library with real-time inference.

Models ChatTTS

a text-to-speech model designed specifically for dialogue scenarios, with decent prosody

Real-Time-Voice-Cloning

a PyTorch implementation of a voice cloning model

WhisperSpeech

a text-to-speech system built by inverting Whisper

2023 StyleTTS2

A text to speech model that supports style diffusion

Tools 2024 OpenVoice

a tool that enables accurate voice cloning with multi-lingual support and flexible style control.

Voice Cloning

a minimal sampling approach

Stable Diffusion Apps 2023 swift-coreml-diffusers

Hugging Face’s own app, using Swift and CoreML for Apple Silicon

2022 Draw Things

Pre-packaged app for iOS, downloads and allows re-use of .ckpt files.

DiffusionBee

Pre-packaged app for macOS (M1 and Intel)

CGI 2023 Blender-ControlNet

A Blender plugin to generate ControlNet inputs for posing figures

2022 dream-textures

A Blender plugin for texturing models based on a text description.

Implementations 2023 OnnxStream

Stable Diffusion XL 1.0 Base on a Raspberry Pi Zero 2 (or in 298MB of RAM)

Libraries 2024 sd4j

a Java library for Stable Diffusion that uses ONNX

Models SDXL-Lightning

an SDXL flavor that works in only a few steps

2023 Upscale Model Database

Too wide a choice, perhaps

2022 Fast Stable Diffusion

Another tactic to accelerate inference

CoreML Stable Diffusion

Apple’s optimizations for CoreML

Reference 2024 comflowy

a set of reference workflows and documentation for ComfyUI

flux

minimal inference examples for FLUX.1 models

Tools comflowyspace

a ComfyUI desktop wrapper

NitroFusion

a high-fidelity, fast (single-step) SDXL diffusion model

2023 ComfyUI-AnimateDiff-Evolved

An AnimateDiff integration for ComfyUI

ComfyUI

pretty impressive node-based UI

InvokeAI

A polished UI

stable-diffusion.cpp

stable diffusion inference on the CPU, in pure C++

ComfyUI-Manager

A component manager for ComfyUI

Opendream

A layer-oriented, non-destructive editor

2022 Stable Diffusion WebUI

Nearly always the best, bleeding edge WebUI for SD

imaginAIry

Works well on Apple Silicon; a pure CLI for all SD models. Does not reuse .ckpt files, however, so it requires a separate disk cache.

Video Generation Models 2024 HunyuanVideo

A pretty impressive open source video generation model

Vision Tools machina

a CCTV viewer that connects to RTSP streams and performs real-time object tagging using YOLO and ollama
