Posts: 2025/05 (6)
My Vision, Computer Vision

Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation
Referring Expression Segmentation (RES) has attracted rising attention, aiming to identify and segment objects based on natural language expressions. While substantial progress has been made in RES, the emergence of Generalized Referring Expression Segment.. (arxiv.org)
Author: Li, Weize, et al. | Journal: IEEE Transactio..

Vision Transformers for Dense Prediction
We introduce dense vision transformers, an architecture that leverages vision transformers in place of convolutional networks as a backbone for dense prediction tasks. We assemble tokens from various stages of the vision transformer into image-like represe.. (arxiv.org)
Author: Ranftl, R., Bochkovskiy, A., & Koltun, V. | Journal: ICCV 2021 | Keyword: DPT | Published ..

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer training times. To.. (arxiv.org)
Author: Lan, Zhenzhong, et al. | Journal: ICLR 2020 | Keyword ..

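GSoC (Google Summer of Code) 2025
Google Summer of Code is an open-source program that runs over the summer. Various overseas companies bring projects, one student is assigned to each project, and mentors from that company provide help and feedback. The Organizations List is grouped by field, such as AI, Security, and Web, and the AI category even includes DeepMind. I applied to Intel's OpenVINO (you can apply to up to three, but I applied to only one), a toolkit that makes deep learning models easy to use. The process from first contact to application differs by company and by project mentor, so this review is based on my own experience.
Project announcement and contact (2/27 ~ 3/24)
For GSoC 2025, on February 27 the projects for each company ..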

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Open-Vocabulary Segmentation (OVS) aims at segmenting images from free-form textual concepts without predefined training classes. While existing vision-language models such as CLIP can generate segmentation masks by leveraging coarse spatial information fr.. (arxiv.org)
Author: Barsellotti, Luca, ..

LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Referring image segmentation is a fundamental vision-language task that aims to segment out an object referred to by a natural language expression from an image. One of the key challenges behind this task is leveraging the referring expression for highligh.. (arxiv.org)
Author: Yang, Zhao, et al. | Journal: CVPR 2022 | Keyword: PWAM, ..