Category: Paper (41)
My Vision, Computer Vision

DINOv2: Learning Robust Visual Features without Supervision (arxiv.org)
Author: Oquab, Maxime, et al. | Journal: Arxiv | Keyword: DINOv2 | Published..
Abstract excerpt: "The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing..."

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders (arxiv.org)
Author: Cao, Jiajun, et al. | Journal: Arxiv | Keyword: Knowledg..
Abstract excerpt: "Visual encoders are fundamental components in vision-language models (VLMs), each showcasing unique strengths derived from various pre-trained visual foundation models. To leverage the various capabilities of these encoders, recent studies incorporate mult..."

EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (arxiv.org)
Author: Wang, Tiannan, ..
Abstract excerpt: "Pre-trained vision-language models (VLMs) have achieved impressive results in a range of vision-language tasks. However, popular VLMs usually consist of hundreds of millions of parameters which brings challenges for fine-tuning and deployment in real-world..."

DoRA: Weight-Decomposed Low-Rank Adaptation (arxiv.org)
Author: Liu, Shih-Yang, et al. | Journal: ICML 2024 | Keyword: DoRA | Published Date: February 2024..
Abstract excerpt: "Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because of avoiding additional inference costs. However, there still often exists an accuracy gap between these methods and full..."

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (arxiv.org)
Author: Ioffe, Sergey, and Christian Sz..
Abstract excerpt: "Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful param..."
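The normalization the abstract describes can be sketched in a few lines; this is a minimal, assumed forward pass (batch statistics only, no running averages or backward pass), not the paper's implementation:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: batch of activations, shape (N, D); gamma/beta are learned per feature.
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize each feature
    return gamma * x_hat + beta            # learned scale and shift

rng = np.random.default_rng(0)
x = rng.standard_normal((128, 8)) * 5 + 3  # shifted, scaled inputs
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
# Each feature of y now has approximately zero mean and unit variance.
```

With gamma = 1 and beta = 0 the layer purely whitens its inputs per feature, which is the mechanism the paper uses to stabilize the input distribution of each layer during training.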

LoRA: Low-Rank Adaptation of Large Language Models (arxiv.org)
Journal: ICLR 2022 | Published Date: June 17, 2021 | Keyword: LLM, RANK
Abstract excerpt: "An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes le..."
Review excerpt: Abstract: As the size of the model..
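The core LoRA idea behind this post can be sketched as follows; shapes, rank, and initialization scale here are illustrative assumptions, but the structure (frozen weight plus a trainable low-rank update B @ A, with B initialized to zero) follows the paper:

```python
import numpy as np

d_out, d_in, r = 64, 32, 4  # rank r is much smaller than min(d_out, d_in)
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def lora_forward(x, scale=1.0):
    # y = W x + scale * B (A x); only A and B are updated during fine-tuning.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)
# Because B starts at zero, the adapted layer initially equals the frozen one,
# and the learned update B @ A never exceeds rank r.
```

Since B @ A can be merged into W after training, the adapted layer adds no extra inference cost, which is the property the abstract highlights.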