Posts in the 'Paper' category (Page 2)


List: Paper (51)

My Vision, Computer Vision

[Paper Summary/Review] Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation (arxiv.org)
Open-Vocabulary Segmentation (OVS) aims at segmenting images from free-form textual concepts without predefined training classes. While existing vision-language models such as CLIP can generate segmentation masks by leveraging coarse spatial information fr..
Author : Barsellotti, Luca, ..

Paper 2025. 5. 2. 19:43

[Paper Summary/Review] LAVT: Language-Aware Vision Transformer for Referring Image Segmentation

LAVT: Language-Aware Vision Transformer for Referring Image Segmentation (arxiv.org)
Referring image segmentation is a fundamental vision-language task that aims to segment out an object referred to by a natural language expression from an image. One of the key challenges behind this task is leveraging the referring expression for highligh..
Author : Yang, Zhao, et al.
Journal : CVPR 2022
Keyword : PWAM, ..

Paper 2025. 5. 2. 18:05

[Paper Summary/Review] A Survey on Hallucination in Large Vision-Language Models

A Survey on Hallucination in Large Vision-Language Models (arxiv.org)
Recent development of Large Vision-Language Models (LVLMs) has attracted growing attention within the AI landscape for its practical implementation potential. However, "hallucination", or more specifically, the misalignment between factual visual content..
Author : Liu, Hanchao, et al.
Journal : Arxiv
Keyword : Survey, Vision Langua..

Paper 2025. 4. 24. 18:26

[Paper Review/Summary] GRES: Generalized Referring Expression Segmentation

GRES: Generalized Referring Expression Segmentation (arxiv.org)
Referring Expression Segmentation (RES) aims to generate a segmentation mask for the object described by a given language expression. Existing classic RES datasets and methods commonly support single-target expressions only, i.e., one expression refers to..
Author : Liu, Chang, Henghui Ding, and Xudong Jiang.
Journal : CVPR 2023
Keyword : Re..

Paper 2025. 4. 16. 12:38

[Paper Summary/Review] DINOv2: Learning Robust Visual Features without Supervision

DINOv2: Learning Robust Visual Features without Supervision (arxiv.org)
The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing..
Author : Oquab, Maxime, et al.
Journal : Arxiv
Keyword : dinov2
Published..

Paper 2025. 3. 31. 14:39

[Paper Summary/Review] MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders (arxiv.org)
Visual encoders are fundamental components in vision-language models (VLMs), each showcasing unique strengths derived from various pre-trained visual foundation models. To leverage the various capabilities of these encoders, recent studies incorporate mult..
Author : Cao, Jiajun, et al.
Journal : Arxiv
Keyword : Knowledg..

Paper 2025. 3. 31. 14:35
