Go to file
2026-05-06 17:46:06 +08:00
.cargo mac打包 2026-03-19 11:54:44 +08:00
readme 更新配图 2026-05-04 15:07:36 +08:00
scripts 新增x86编译 2026-05-06 15:15:54 +08:00
src 优化提示 2026-05-04 14:49:15 +08:00
src-tauri 处理api失败频繁channel closed 2026-05-06 17:46:06 +08:00
.gitignore 新增x86编译 2026-05-06 15:15:54 +08:00
agent.md init 2026-03-18 15:36:08 +08:00
index.html 更新ui 2026-04-28 18:33:11 +08:00
LICENSE 新增md 2026-05-04 15:02:40 +08:00
package-lock.json 新增i18n 2026-04-30 17:56:20 +08:00
package.json 新增x86编译 2026-05-06 15:15:54 +08:00
postcss.config.js init 2026-03-18 15:36:08 +08:00
README.en.md 新增x86编译 2026-05-06 15:15:54 +08:00
README.md 新增x86编译 2026-05-06 15:15:54 +08:00
tailwind.config.js 更新ui 2026-04-28 18:33:11 +08:00
tsconfig.json init 2026-03-18 15:36:08 +08:00
vite.config.ts 初始化 2026-03-18 22:14:49 +08:00
yarn.lock 新增分页,新增win打包 2026-05-01 22:31:32 +08:00

CrossSubtitle-AI 截图

CrossSubtitle-AI

AI-Powered, Local-First Subtitle Workbench

GitHub Release GitHub License Platform

English · 简体中文


About

CrossSubtitle-AI is a local-first audio/video subtitle processing tool. It uses Whisper for speech recognition, Silero VAD for voice activity detection, and supports OpenAI-compatible APIs for intelligent translation — helping you quickly transcribe and translate media files into bilingual subtitles.

All speech recognition runs locally on your machine. No audio or video files are ever uploaded to any server, ensuring your data privacy.

Features

  • Speech Recognition — High-accuracy speech-to-text powered by Whisper, supporting 17 source languages including Chinese, English, Japanese, Korean, French, and more
  • Voice Activity Detection — Silero VAD precisely splits speech segments and automatically filters out silence
  • Smart Translation — Connect to any OpenAI-compatible API (GLM, DeepSeek, ChatGPT, etc.) to translate transcripts into your target language
  • Audio Extraction — Built-in FFmpeg automatically extracts audio and converts to 16kHz mono WAV
  • Multiple Export Formats — Export subtitles in SRT, VTT, and ASS formats
  • Bilingual Export — Export side-by-side original + translated bilingual subtitles
  • Subtitle Editor — Built-in editor for modifying both source text and translations line by line
  • Drag & Drop — Drag and drop files to quickly create tasks
  • Task Queue — Batch process multiple media files with real-time progress tracking
  • Bilingual UI — Switch between Chinese and English interface languages
  • Local-First — Speech recognition runs entirely locally, no data upload required

Workflow

  1. Choose Mode — Select "Source" for transcription only, or "Translate" mode for automatic translation after transcription
  2. Add Task — Click "Add Task" or drag-and-drop media files onto the window
  3. Wait for Processing — Tasks go through: Audio Extraction → VAD Segmentation → Speech Recognition → (Optional) Translation
  4. Review & Edit — View and modify recognition results and translations in the subtitle editor
  5. Export Subtitles — Export as SRT, VTT, or ASS format

Screenshots

Subtitle Subtitle Editor
Subtitle Subtitle Editor

Installation

Download the installer for your platform from GitHub Releases:

Platform Package
macOS (Apple Silicon) .dmg
Windows .exe (NSIS Installer)

Usage

Quick Start

  1. Open the app and select a mode from the top toolbar:
    • Source — Speech recognition only, outputs source language subtitles
    • Translate — Transcribes then translates via an LLM API
  2. Click "Add Task" or drag-and-drop files onto the window
  3. Wait for processing to complete
  4. Review and edit results in the subtitle editor on the right
  5. Click "Export" to save subtitles in your preferred format

Translation Configuration

Before using the translation feature, configure the LLM API:

  • Fill in the LLM API Base, API Key, and Model in "Advanced Settings"
  • Works with any OpenAI-compatible service, including:
    • GLM (Zhipu AI) — GLM-4.7-Flash available for free
    • DeepSeek
    • ChatGPT
    • Self-hosted — Ollama, vLLM, etc.

Advanced Settings

  • Whisper Model Path — Path to a local ggml model file
  • VAD Model Path — Path to a local Silero VAD ONNX model file
  • Batch Size — Number of segments to translate per batch (10-15)
  • Context Size — Number of preceding segments to include as context for translation (0-5)

Development

Prerequisites

  • Rust toolchain
  • Node.js (18+)
  • FFmpeg (must be available on the command line)
  • CMake (required for compiling whisper-rs)

Local Development

# Clone the repository
git clone https://github.com/AndySkaura/crosssubtitle-ai.git
cd crosssubtitle-ai

# Install frontend dependencies
npm install

# Start development mode
npm run tauri-dev

Build

# macOS DMG build
npm run tauri-build-dmg

# Windows NSIS build
npm run tauri-build-windows

Tech Stack

Layer Technology
Desktop Framework Tauri v2
Frontend Vue 3 + TypeScript
State Management Pinia
Styling Tailwind CSS
Internationalization vue-i18n
Speech Recognition whisper-rs (Whisper)
Voice Detection ort (Silero VAD ONNX)
Audio Processing FFmpeg
LLM Translation OpenAI-compatible API

Project Structure

src/                      Vue frontend
  components/             UI components (TaskQueue, SubtitleEditor)
  stores/                 Pinia state management
  locales/                i18n locale files (zh-CN, en)
  lib/                    Type definitions
src-tauri/                Rust backend
  src/
    audio.rs              Audio extraction & WAV reading
    vad.rs                Silero VAD voice activity detection
    whisper.rs            Whisper speech recognition interface
    translate.rs          OpenAI-compatible translation interface
    subtitle.rs           SRT / VTT / ASS export
    task.rs               Task orchestration & event broadcasting
    state.rs              Application state

License

This project is licensed under the MIT License.

Acknowledgements

  • whisper.cpp — High-performance Whisper inference implementation
  • Silero VAD — High-accuracy voice activity detection
  • Tauri — Lightweight desktop application framework
  • All contributors and users

Made by kuraa