Librechat
Revision as of 19:56, 7 April 2026
Navigation
The Interface
When you first log in, you'll see three main areas:
- The chat area (center): Where you type messages and read AI responses.
- The left sidebar: Shows your conversation history, navigation options, and account settings.
- New Chat button: Starts a fresh conversation. Each conversation is saved separately so you can return to it later.
- Conversation History: A list of your past chats, labeled by topic or date. Click any item to reopen that conversation.
- Settings / Profile: Found at the bottom of the sidebar — here you can adjust preferences, view your account info, and manage other options.
- (You can collapse the sidebar by clicking the toggle icon at the top left, giving you more screen space for the chat itself.)
- The right sidebar (side panel): A context panel with tools and settings relevant to your current conversation.
- Agent Builder: Create and configure custom AI agents tailored to specific tasks or workflows. Agents can be given a name, instructions, and specific tools to use.
- Prompts: Access and manage your saved prompt templates. (See the Prompts section below for how to create and use these.)
- Memories: LibreChat can store key information across conversations, creating a persistent knowledge base. The Memories panel lets you view, add, or remove these stored details.
- Parameters: Adjust advanced settings for the current conversation, such as the AI's response style (temperature) and other model-specific options. Useful if you want to experiment with how the model behaves.
- Attach Files: Upload documents or images directly into your conversation so the AI can read and reference them.
- Bookmarks: Save specific messages or conversations you want to return to quickly, like bookmarking a page in a book.
- MCP Settings: MCP (Model Context Protocol) allows the AI to connect to external tools and services. This panel lets you view and configure those connections if they have been set up for your instance.
- Hide Panel: Collapses the right sidebar to give you more screen space. Click the panel toggle to bring it back.
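To build intuition for what the temperature setting in the Parameters panel does, here is a minimal Python sketch of temperature-scaled sampling. This is illustrative only (the models implement this internally); the logit values are made up for the example. Lower temperature concentrates probability on the likeliest next word, higher temperature spreads it out.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw model scores (logits) into probabilities.

    Lower temperature sharpens the distribution (more predictable output);
    higher temperature flattens it (more varied output).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words
logits = [2.0, 1.0, 0.1]

cool = softmax_with_temperature(logits, 0.5)  # low temperature: top word dominates
warm = softmax_with_temperature(logits, 2.0)  # high temperature: choices even out
```

In practice this means a low temperature is a good fit for factual Q&A and code, while a higher temperature suits brainstorming and creative writing.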
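For context on where the MCP connections shown in the MCP Settings panel come from: they are defined server-side by your administrator, typically in LibreChat's librechat.yaml configuration file. The fragment below is a hedged sketch of what such an entry can look like (the filesystem server name, path, and exact keys are illustrative and may vary by LibreChat version):

```yaml
# librechat.yaml (server-side, administrator-managed) — illustrative sketch
mcpServers:
  filesystem:            # hypothetical example server name
    type: stdio          # launched as a local subprocess
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-filesystem"
      - /path/to/shared/files   # placeholder path
```

If the panel is empty for you, it simply means no MCP servers have been configured for your instance.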
Switching between Models
One of LibreChat's most powerful features is the ability to use different AI models — each with different strengths. To change models:
- At the top of a new or existing conversation, look for the model selector dropdown (it will display the name of the currently active model, such as "GPT-4o" or "Claude 3.5 Sonnet").
- Click on it to open a list of available models.
- Select the model you'd like to use. The change takes effect immediately for any new messages you send.
Saving Prompts and Presets
The Prompts feature lets you save reusable instructions or templates so you don't have to retype them each time. This is especially useful for tasks you do repeatedly, like summarizing meeting notes or drafting a certain kind of email. A Preset saves not just a prompt but also your model choice and advanced settings (like response style or behavior). Think of it as a full "configuration" you can load with one click.
To create a new prompt:
- Click on Prompts in the right side panel.
- Click the "+ New Prompt" or "Create Prompt" button.
- Give your prompt a clear, descriptive title (e.g., "Summarize Meeting Notes").
- Type your prompt text in the body. You can include placeholders using curly braces, like {topic} or {document}, which you can fill in each time you use it.
- Click Save.
To use a saved prompt:
- In the chat message box, type / — a menu of your saved prompts will appear.
- Select the prompt you want. If it has placeholders, you'll be prompted to fill them in before sending.
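Under the hood, the curly-brace placeholders described above are simple string templating. A small Python sketch of the idea (the prompt text and helper function are hypothetical, not LibreChat's actual code):

```python
# A saved prompt template with {curly-brace} placeholders, as described above.
template = "Summarize the following {document} in 3 bullet points for {audience}."

def fill_prompt(template, **values):
    """Replace each {placeholder} in the template with its supplied value."""
    return template.format(**values)

# Filling in the placeholders produces the message that is actually sent.
message = fill_prompt(
    template,
    document="meeting notes",
    audience="a non-technical audience",
)
```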
To create a new preset:
- Set up your conversation the way you like it — choose your model and, optionally, adjust settings via the Parameters panel.
- Open the model/settings panel and look for a "Save as Preset" option.
- Name your preset and save it.
To use a saved preset:
- Click the preset menu at the top of a new conversation.
- Select the preset you saved. Your model and settings will be applied automatically.
Uploading Files
LibreChat allows you to upload documents and images directly into a conversation so the AI can read and respond to them. See #Models for more details on support for different upload formats.
To upload a file:
- Click Attach Files in the right side panel, or look for the paperclip icon in the message input area.
- Choose Upload as Text (generally the preferred option for code, text, or CSV files) or, where the model supports it, Upload to Provider (for image formats, PDFs, Word documents, or spreadsheets).
- Select a file from your computer (PDFs, Word documents, images, and text files are commonly supported).
- Once uploaded, you can ask questions about the file or ask the AI to summarize, analyze, or extract information from it.
Sharing a Conversation
If you'd like to share an interesting or useful conversation with a colleague:
- Open the conversation you want to share.
- Look for a Share option in the conversation menu (often represented by three dots ... or a share icon at the top of the chat).
- LibreChat will generate a unique, view-only link you can send to anyone — even people who don't have a LibreChat account.
Temporary/Private Chats
If you're working with sensitive information and don't want a conversation saved to your history, you can use Temporary Chat mode.
- Look for a Temporary Chat toggle, usually accessible from the new chat options or a menu near the top of the interface.
- Conversations in this mode are not stored after you close them.
Models
Remote Models
OpenAI ChatGPT 5.2
Best for: The absolute cutting edge in reasoning, math, coding, and knowledge work. Also very good for writing.
Target majors: Law, Mathematics, Physics, Engineering, Computer Science, Graduate Researchers, anyone needing the highest possible accuracy.
Pros: State-of-the-art across nearly every benchmark. It achieves a perfect 100% on AIME 2025 (competition math) and leads in software engineering (80% SWE-Bench Verified). Its abstract reasoning (ARC-AGI-2: 52.9%) is in a class of its own compared to its predecessors.
Cons: As a cloud model, your prompts are sent to OpenAI's servers. Response times can vary based on server load.
Google Gemini 2.5 Pro
Best for: Long-context tasks, multilingual work, math competitions, and strong all-around coding and reasoning.
Target majors: Computer Science, Mathematics, Foreign Languages, Data Science, any field requiring processing very long documents.
Pros: Boasts a massive native context window (up to 1M tokens). Excels at math (AIME 2025: 86.7%, AIME 2024: 92.0%) and has strong coding chops (SWE-Bench Verified: 63.8%, LiveCodeBench v5: 70.4%). It's also a powerhouse for multilingual tasks (Global MMLU Lite: 89.8%) and long-context retrieval (MRCR 128k: 94.5%).
Cons: As a cloud model, your prompts are sent to Google's servers. Some benchmarks trail behind GPT-5.2 Thinking (e.g., GPQA Diamond: 84.0% vs. 92.4%).
Google Gemini 2.5 Flash
Best for: Fast, cost-effective tasks where speed matters more than peak accuracy. Quick Q&A, summarization, and everyday student use.
Target majors: All majors needing a fast, reliable online assistant for general tasks.
Pros: Significantly faster and cheaper than Gemini 2.5 Pro while still being very capable. Great for high-volume, lower-stakes work.
Cons: Benchmarks are lower than Pro across the board. Not ideal for tasks requiring peak reasoning or coding performance.
Local Models
Qwen 3.5 (qwen3.5:35b)
Best For: Deep research, literature reviews, digesting massive textbooks, and synthesizing multiple PDFs.
Target Majors: Law, History, Pre-Med, Computer Science, Literature, Graduate Researchers.
Pros: Features a massive context window (natively 262K tokens, which is roughly 400 pages of text). It activates only 3 billion parameters at a time, making it highly efficient while retaining frontier-level intelligence.
Cons: Even though the model supports a large context window, the server's GPU VRAM limits how much of that window can actually be used, which caps the size of your document uploads. Keep document uploads reasonable.
Benchmarks: 70.2 on IFBench (instruction following), 84.2 on GPQA Diamond (graduate-level reasoning), 85.2 on MMMLU (multilingual knowledge), and 89.3 on OmniDocBench v1.5 (document recognition and understanding).
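The "roughly 400 pages" figure for the 262K-token window above can be sanity-checked with a back-of-the-envelope conversion. The two ratios are assumptions, not exact values: English text averages roughly 0.75 words per token, and a typical double-spaced page holds about 500 words.

```python
context_tokens = 262_144   # Qwen 3.5's native context window (262K tokens)
words_per_token = 0.75     # rough average for English text (assumption)
words_per_page = 500       # typical double-spaced page (assumption)

pages = context_tokens * words_per_token / words_per_page
# pages works out to about 393, i.e. roughly 400 pages of text
```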
Qwen3 Coder Next (qwen3-coder-next:latest)
Best For: Complex programming assignments, navigating multi-file repositories, and long-form software engineering projects.
Target Majors: Computer Science, Software Engineering, Data Science, IT.
Pros: This is an absolute powerhouse for coding. It understands how different files in a codebase connect and can recover from execution errors natively. It features a 256K context window for dumping entire codebases into the chat.
Cons: It is a massive 80-billion-parameter model, taking up ~50GB of our server's disk space. It is highly specialized for code, meaning it will likely write dry, robotic essays if you try to use it for creative writing.
Benchmarks: Achieves 70.6% on SWE-Bench Verified (a benchmark testing real-world GitHub issue resolution), matching or beating models that require 10x the computing power.
GLM-4.7 Flash (glm-4.7-flash:latest)
Best For: Web development, UI/UX prototyping, generating structured data, and multilingual translation.
Target Majors: Web Design, Human-Computer Interaction (HCI), Graphic Design, Business, Foreign Languages.
Pros: Blazing fast. This model is a master of "vibe coding", which means it is exceptionally good at understanding design specifications and generating aesthetically pleasing, modern HTML/CSS/JS. It's also great at formatting presentations and translating text naturally.
Cons: It trades deep, rigorous mathematical logic for raw speed and visual aesthetics.
Benchmarks: 59.2% SWE-Bench Verified (Lower than Qwen3 Coder Next)
OpenAI GPT-OSS (gpt-oss:20b)
Best For: Complex formulas, advanced physics, logical deduction, and step-by-step problem-solving.
Target Majors: Mathematics, Physics, Engineering, Economics, Philosophy (Formal Logic).
Pros: This is OpenAI's open-weight reasoning model, bringing their advanced o-series (like o3-mini) logic to our local hardware. It "thinks" before it answers, making it incredibly powerful for STEM homework that requires showing the work. At 21B parameters, it runs extremely smoothly on our 24GB VRAM limit.
Cons: The visible "chain of thought" can be very long and messy before it spits out the final answer. Its context window is slightly smaller (131K tokens) than the Qwen models.
Benchmarks: Matches or exceeds OpenAI's own o3-mini on core mathematical reasoning and health evaluations, achieving 68.8% on the graduate-level GPQA Diamond benchmark and 65% on IFBench.
Qwen3 Next Base (qwen3-next:80b)
Best For: General-purpose writing, brainstorming, broad knowledge retrieval, and standard Q&A.
Target Majors: Communications, Business, Education, Liberal Arts, Undecided.
Pros: An excellent "jack of all trades." If you just need a standard chat model to help brainstorm an essay topic, explain a concept, or outline a project, this has the broadest baseline knowledge without getting hyper-focused on code or math.
Cons: Jack of all trades, master of none.
Benchmarks: 76% GPQA Diamond, 61% IFBench
Tips for Getting Better Results
- Be specific. The more context you give the AI, the better its response will be. Instead of "summarize this," try "summarize this in 3 bullet points for a non-technical audience."
- Iterate. If the first response isn't quite right, follow up with clarifying instructions rather than starting over.
- Try different models. Different models have different strengths — exploring those differences is exactly what this pilot is designed for!
- Save prompts you like. If you craft a really effective prompt, save it so you can reuse it easily.