My Vision, Computer Vision
Category: Paper (51)
SPICE: Semantic Propositional Image Caption Evaluation
Journal: ECCV 2016 | Published: September 16, 2016 | Keywords: Evaluation Metric, SP…
"There is considerable interest in the task of automatically generating image captions. However, evaluation is challenging. Existing automatic evaluation metrics are primarily sensitive to n-gram overlap, which is neither necessary nor sufficient for the ta…" (arxiv.org)
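The abstract's "neither necessary nor sufficient" claim is easy to demonstrate concretely. A minimal sketch (the toy captions and whitespace tokenization are my own, purely for illustration): a correct paraphrase can have zero bigram overlap with the reference, while a scrambled sentence keeps most of it.

```python
from collections import Counter

def bigram_overlap(cand: str, ref: str) -> float:
    """Fraction of candidate bigrams that also occur in the reference (clipped)."""
    grams = lambda s: Counter(zip(s.split(), s.split()[1:]))
    c, r = grams(cand), grams(ref)
    return sum(min(v, r[g]) for g, v in c.items()) / max(sum(c.values()), 1)

ref = "a young girl standing on top of a tennis court"
# Correct paraphrase, zero overlap: n-gram overlap is not *necessary*.
print(bigram_overlap("a child plays tennis", ref))                            # 0.0
# Garbled word order, high overlap: n-gram overlap is not *sufficient*.
print(bigram_overlap("standing on top a young girl of a tennis court", ref))  # ~0.78
```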
CIDEr: Consensus-based Image Description Evaluation
Journal: CVPR 2015 | Published: November 20, 2014 | Keywords: CIDEr score, Evaluation Metric, Microsoft
"Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classifica…" (arxiv.org)
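CIDEr's core is to represent each caption as a tf-idf-weighted n-gram vector and average its cosine similarity against every reference caption (over n = 1…4 in the paper). A simplified single-n sketch; the toy corpus, function names, and the omission of stemming and count clipping are my simplifications, not the reference implementation:

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def tfidf(tokens, n, doc_freq, num_images):
    """TF of each n-gram in the caption, down-weighted by how many images'
    reference sets contain it: rare n-grams carry more consensus signal."""
    counts = ngram_counts(tokens, n)
    total = sum(counts.values())
    return {g: (c / total) * math.log(num_images / doc_freq.get(g, 1))
            for g, c in counts.items()}

def cosine(a, b):
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    na, nb = (math.sqrt(sum(v * v for v in d.values())) for d in (a, b))
    return dot / (na * nb) if na and nb else 0.0

def cider_n(cand, refs, n, doc_freq, num_images):
    """CIDEr_n: mean cosine similarity between candidate and each reference."""
    cv = tfidf(cand.split(), n, doc_freq, num_images)
    return sum(cosine(cv, tfidf(r.split(), n, doc_freq, num_images))
               for r in refs) / len(refs)

refs = ["a man is riding a horse", "a person rides a brown horse"]
df = Counter()                      # toy document frequencies (one image only)
for r in refs:
    df.update(set(ngram_counts(r.split(), 1)))
print(cider_n("a man rides a horse", refs, 1, df, num_images=100))
```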
BLEU (Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics)
Published: July 1, 2002 | Keywords: BLE…
"We present the results of an experiment on extending the automatic method of Machine Translation evaluation BLEU with statistical weights for lexical items, such as tf.idf scores. We show that this extension gives additional information about evaluated …" (dl.acm.org)
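The original BLEU that this snippet builds on is the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. A minimal single-reference, sentence-level sketch, assuming whitespace tokenization and no smoothing (production implementations add both, plus multi-reference support):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence BLEU: geometric mean of clipped n-gram precisions x brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec = []
    for n in range(1, max_n + 1):
        c_counts, r_counts = Counter(ngrams(cand, n)), Counter(ngrams(ref, n))
        # Clip each candidate n-gram by its reference count so repeating
        # a matching word cannot inflate precision.
        clipped = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        log_prec.append(math.log(max(clipped, 1e-9) / max(sum(c_counts.values()), 1)))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(log_prec) / max_n)

print(bleu("the cat sat on the mat", "the cat is on the mat", max_n=2))  # ~0.71
print(bleu("the cat sat on the mat", "the cat is on the mat"))          # ~0: no 4-gram match
```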
Learning to Prompt for Vision-Language Models
Journal: Springer 2022 | Published: September 2, 2021
"Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks. Different from the traditional representation learning that is based mostly on discreti…" (arxiv.org)
Problem: Existing VLMs such as CLIP rely on prompt engineering for zero-shot t…
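CoOp's answer to hand-tuned prompt engineering is to learn the prompt's context tokens as continuous vectors while keeping CLIP frozen. A minimal PyTorch sketch of that idea; the dimensions, initialization scale, and the way class-name embeddings are passed in are illustrative assumptions, not the paper's exact code:

```python
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    """CoOp-style learnable prompt: M shared context vectors are optimized
    by backprop while the CLIP encoders stay frozen."""
    def __init__(self, class_embeddings: torch.Tensor, n_ctx: int = 16):
        super().__init__()
        dim = class_embeddings.size(-1)
        # Learnable "context words" replacing a template like "a photo of a {class}".
        self.ctx = nn.Parameter(0.02 * torch.randn(n_ctx, dim))
        # Frozen token embeddings of the class names, shape (K, dim).
        self.register_buffer("cls_emb", class_embeddings)

    def forward(self) -> torch.Tensor:
        k = self.cls_emb.size(0)
        ctx = self.ctx.unsqueeze(0).expand(k, -1, -1)  # (K, n_ctx, dim)
        cls = self.cls_emb.unsqueeze(1)                # (K, 1, dim)
        # One prompt per class, fed to the frozen text encoder downstream.
        return torch.cat([ctx, cls], dim=1)            # (K, n_ctx + 1, dim)

prompts = PromptLearner(torch.randn(10, 512))()        # 10 classes, 512-d tokens
print(prompts.shape)                                   # torch.Size([10, 17, 512])
```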
LLaMA: Open and Efficient Foundation Language Models
Published: February 27, 2023 | Meta AI
"We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, witho…" (arxiv.org)
Problem: Recent work has studied how to jointly optimize LLM and dataset size under a limited compute budget…
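The compute-budget line refers to compute-optimal scaling (Chinchilla's roughly 20 tokens per parameter); LLaMA instead trains comparatively small models well past that point so they are cheaper to serve at inference. A back-of-envelope sketch using the standard C ≈ 6·N·D FLOPs approximation (the token count is LLaMA-7B's reported figure; the comparison itself is my illustration):

```python
# Back-of-envelope training-compute arithmetic, using the standard
# approximation C ≈ 6·N·D FLOPs (N = parameters, D = training tokens).
def train_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

n = 7e9                                   # LLaMA-7B
chinchilla = train_flops(n, 20 * n)       # ~20 tokens/param "compute-optimal" point
llama = train_flops(n, 1.0e12)            # LLaMA-7B's reported ~1T training tokens

# LLaMA spends ~7x the "optimal" training compute on a small model,
# trading training cost for cheaper inference.
print(f"compute-optimal: {chinchilla:.2e} FLOPs, LLaMA-7B: {llama:.2e} FLOPs")
```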
BEiT: BERT Pre-Training of Image Transformers
"We introduce a self-supervised vision representation model BEiT, which stands for Bidirectional Encoder representation from Image Transformers. Following BERT developed in the natural language processing area, we propose a masked image modeling task to pre…" (arxiv.org)
Problem: Proposes BEiT (Bidirectional Encoder representation from Image Transformers). ViT … CNN…
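BEiT's masked image modeling mirrors BERT: mask a subset of image patches and train the encoder to predict the discrete visual token (from a frozen dVAE tokenizer) at each masked position. A minimal PyTorch sketch; the layer sizes, masking ratio, and the random stand-ins for patch embeddings and tokenizer outputs are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes, not the paper's exact configuration.
num_patches, dim, vocab = 196, 768, 8192   # 14x14 patches, dVAE codebook size

encoder = nn.TransformerEncoder(            # stand-in for the ViT backbone
    nn.TransformerEncoderLayer(d_model=dim, nhead=12, batch_first=True),
    num_layers=2)
to_logits = nn.Linear(dim, vocab)           # predicts a visual token per patch
mask_token = nn.Parameter(torch.zeros(1, 1, dim))

patches = torch.randn(4, num_patches, dim)                  # patch embeddings
visual_tokens = torch.randint(0, vocab, (4, num_patches))   # frozen dVAE output
mask = torch.rand(4, num_patches) < 0.4                     # positions to corrupt

# Replace masked patch embeddings with the shared [MASK] embedding.
x = torch.where(mask.unsqueeze(-1), mask_token.expand_as(patches), patches)
logits = to_logits(encoder(x))

# As in BERT, the loss is computed only at the masked positions.
loss = F.cross_entropy(logits[mask], visual_tokens[mask])
loss.backward()
```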