'Paper' 카테고리의 글 목록

Notice

Recent Posts

Recent Comments

Link

« 2025/11 »
일	월	화	수	목	금	토
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30

Tags more

Archives

Today

Total

관리 메뉴

목록Paper (51)

My Vision, Computer Vision

[논문 요약/리뷰] Golden Cudgel Network for Real-Time Semantic Segmentation

Golden Cudgel Network for Real-Time Semantic SegmentationRecent real-time semantic segmentation models, whether single-branch or multi-branch, achieve good performance and speed. However, their speed is limited by multi-path blocks, and some depend on high-performance teacher models for training. To overcome thearxiv.org초록기존 Real-time 시멘틱 세그멘테이션 모델은 Multi-path 블럭에 의해 속도가 제한되거나고성능 티쳐 모델과에 의존한다는 한..

Paper 2025. 11. 18. 20:46

[논문 요약/리뷰] CRIS: CLIP-Driven Referring Image Segmentation

Overview본 논문에서는 CLIP 모델을 REF(Referring Expression Segmentation) Task에 사용한다.동시에 Vision-Language Decoder 및 Text-to-pixel 대조 학습을 제안한다.Problem Statement(당시) Multi-modal 분야에서 CLIP 모델이 성공적인 결과를 보여준 바 있다. 따라서 본 논문에서는 CLIP을 RES에 도입한다.하지만 위 사진에서처럼, CLIP을 Naive하게 사용하는 방법으로는 최적의 성능을 내지 못하는데, 이유는 Pixel-level 예측 태스크인 RES와 다르게, CLIP은 Image-level(Contrastive)로 훈련되었기 때문이다.따라서 시각적 특징을 세부적으로 학습해야하는 목표와 맞지 않게, CLI..

Paper 2025. 7. 4. 19:15

[논문 요약/리뷰] GSVA: Generalized Segmentation via Multimodal Large Language Models

GSVA: Generalized Segmentation via Multimodal Large Language ModelsGeneralized Referring Expression Segmentation (GRES) extends the scope of classic RES to refer to multiple objects in one expression or identify the empty targets absent in the image. GRES poses challenges in modeling the complex spatial relationships of tarxiv.orgAuthor: Xia, Zhuofan, et al.Journal: CVPR 20204Published Date: 202..

Paper 2025. 6. 18. 13:44

[논문 요약/리뷰] Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation

Bring Adaptive Binding Prototypes to Generalized Referring Expression SegmentationReferring Expression Segmentation (RES) has attracted rising attention, aiming to identify and segment objects based on natural language expressions. While substantial progress has been made in RES, the emergence of Generalized Referring Expression Segmentarxiv.orgAuthor : Li, Weize, et al.Journal : IEEE Transactio..

Paper 2025. 5. 28. 16:19

[논문 요약/리뷰] Vision Transformers for Dense Prediction

Vision Transformers for Dense PredictionWe introduce dense vision transformers, an architecture that leverages vision transformers in place of convolutional networks as a backbone for dense prediction tasks. We assemble tokens from various stages of the vision transformer into image-like represearxiv.org Author : Ranftl, R., Bochkovskiy, A., & Koltun, V.Journal : ICCV 2021Keyword : DPRPublished ..

Paper 2025. 5. 19. 19:11

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

ALBERT: A Lite BERT for Self-supervised Learning of Language RepresentationsIncreasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer training times. Toarxiv.orgAuthor : Lan, Zhenzhong, et al.Journal : ICLR 2020Keyword ..

Paper 2025. 5. 16. 10:34

이전 Prev 1 2 3 4 ··· 9 Next 다음

목록Paper (51)

My Vision, Computer Vision

티스토리툴바