Yujiaxuan Wang | Portfolio

About

What I build

I build projects around large language models, multimodal reasoning, local-first AI workflows, and human-centered interfaces. My work combines AI pipelines, retrieval systems, lightweight product design, and interactive demos that make complex model behavior easier to inspect, search, and explain.

Projects

Selected work

Personal AI Project

vision-index

A local image indexing and semantic search system that analyzes images with a vision-language model through Ollama and serves retrieval through a lightweight FastAPI dashboard.

Scans or uploads images, generates thumbnails, extracts structured metadata, and stores search-ready indices.
Combines SQLite metadata, Chroma embeddings, and field-aware reranking for natural-language image search.
Built as a practical local-first AI application that covers the full pipeline: ingestion, indexing, retrieval, and evaluation.
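
As an illustration of the field-aware reranking idea, here is a minimal sketch in plain Python. It is not the project's actual code: the `Candidate` class, the field names (`caption`, `objects`, `scene`), the weights, and the `alpha` blend factor are all assumptions made for this example. It assumes an embedding search has already returned candidates with a similarity score in the range 0 to 1, and then blends that score with a weighted keyword overlap against each metadata field.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One image returned by the embedding search (hypothetical shape)."""
    image_id: str
    vector_score: float  # assumed to be a similarity in [0, 1]
    metadata: dict[str, str] = field(default_factory=dict)


# Hypothetical per-field weights: matches in a caption count more than in a scene tag.
FIELD_WEIGHTS = {"caption": 1.0, "objects": 0.7, "scene": 0.5}


def field_overlap(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the field text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def rerank(query: str, candidates: list[Candidate], alpha: float = 0.6) -> list[Candidate]:
    """Sort candidates by a blend of vector similarity and weighted field overlap."""
    total_weight = sum(FIELD_WEIGHTS.values())

    def score(candidate: Candidate) -> float:
        field_score = sum(
            weight * field_overlap(query, candidate.metadata.get(name, ""))
            for name, weight in FIELD_WEIGHTS.items()
        ) / total_weight
        return alpha * candidate.vector_score + (1 - alpha) * field_score

    return sorted(candidates, key=score, reverse=True)


results = rerank(
    "red bicycle",
    [
        Candidate("a", 0.80, {"caption": "a city street at night"}),
        Candidate("b", 0.75, {"caption": "red bicycle leaning on a wall",
                              "objects": "bicycle wall"}),
    ],
)
print([c.image_id for c in results])
```

In this example the second image has a slightly lower vector score but matches the query in its caption and object fields, so the blended score moves it to the top. A real system would likely use better tokenization and tuned weights.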

Source Code

Thesis Showcase

GeoMindMap

An end-to-end visualization framework for inspecting how multimodal LLMs reason through image geo-localization tasks step by step.

Turns free-form reasoning into structured clues, location hypotheses, and dynamic maps.
Designed as a thesis-facing demo that makes model reasoning easier to explain in interviews and research discussions.
Includes a live interactive webpage and a separate source repository with core logic.
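
To make the idea of "structured clues, location hypotheses, and dynamic maps" concrete, the sketch below shows one possible data shape in Python. It is an assumption for illustration only, not the GeoMindMap schema: the class names, the `kind` labels, and the `step_to_geojson` helper are hypothetical. Each reasoning step holds the clues the model mentioned and the places it is considering, and the helper turns the hypotheses into a GeoJSON feature collection that a map widget could draw.

```python
from dataclasses import dataclass, field


@dataclass
class Clue:
    """A visual cue extracted from the model's reasoning (hypothetical shape)."""
    text: str
    kind: str  # e.g. "signage", "vegetation", "architecture" (illustrative labels)


@dataclass
class Hypothesis:
    """A candidate location with the model's stated confidence."""
    name: str
    lat: float
    lon: float
    confidence: float


@dataclass
class ReasoningStep:
    """One step of the reasoning trace: clues seen so far and current guesses."""
    index: int
    clues: list[Clue] = field(default_factory=list)
    hypotheses: list[Hypothesis] = field(default_factory=list)


def step_to_geojson(step: ReasoningStep) -> dict:
    """Convert a step's hypotheses to a GeoJSON FeatureCollection."""
    features = [
        {
            "type": "Feature",
            # GeoJSON coordinates are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [h.lon, h.lat]},
            "properties": {
                "name": h.name,
                "confidence": h.confidence,
                "step": step.index,
            },
        }
        for h in step.hypotheses
    ]
    return {"type": "FeatureCollection", "features": features}


step = ReasoningStep(
    index=1,
    clues=[Clue("street signs in German", "signage")],
    hypotheses=[Hypothesis("Munich", 48.137, 11.575, 0.6)],
)
geojson = step_to_geojson(step)
print(geojson["features"][0]["properties"]["name"])
```

Rendering one feature collection per step is what would let a map update as the reasoning progresses.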

Live Demo | Source Code

Hackathon Build

TOCwise

An AI-powered semantic table of contents for long blogs and chat histories, built to improve navigation inside dense conversational content.

Built in a two-person team for the Chrome Built-In AI Hackathon 2025.
Runs offline with Chrome built-in AI capabilities.
Supports jump navigation, segmentation, editable headings, dark mode, and search.

Source Code

Contact

Links

ge84qof@mytum.de | @wyjxx on GitHub