Post list: Paper (46)
My Vision, Computer Vision

VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning (arxiv.org)
Recent advances in Large Vision-Language Models (LVLMs) have significantly improved performance in image comprehension tasks, such as formatted charts and rich-content images. Yet, Graphical User Interfaces (GUIs) pose a greater challenge due to their structu...
Abstract: Existing VLMs tend to ignore the visual input and rely excessively on the text...

REDQT: a method for automated mobile application GUI testing based on deep reinforcement learning algorithms (www.springerprofessional.de)
As mobile applications become increasingly prevalent in daily life, the demand for their functionality and reliability continues to grow. Traditional mobile application testing methods, particularly graphical user interface (GUI) testing, face ...
Abstract: This paper applies deep reinforcement lear...

End-to-End Object Detection with Transformers (arxiv.org)
We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor gene...
Abstract: DETR views object detection as a direct set prediction problem. It also removes NMS and anchor genera...
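To make the direct-set-prediction idea concrete, here is a minimal sketch (not the DETR reference code; the function name and cost weights are illustrative assumptions) of matching a fixed set of predictions one-to-one to ground-truth boxes with the Hungarian algorithm, which is what lets the pipeline drop NMS and anchor generation:

```python
# Sketch only: the paper's matching cost also includes a generalized IoU term;
# this version keeps just a classification term and an L1 box term.
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_probs, pred_boxes, gt_labels, gt_boxes,
                    w_cls=1.0, w_box=5.0):
    """pred_probs: (N, C) class probabilities, pred_boxes: (N, 4),
    gt_labels: (M,) int class ids, gt_boxes: (M, 4).
    Returns matched (pred_indices, gt_indices)."""
    # Classification cost: negative probability of each GT object's class.
    cost_cls = -pred_probs[:, gt_labels]                          # (N, M)
    # Box cost: L1 distance between predicted and ground-truth boxes.
    cost_box = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)
    cost = w_cls * cost_cls + w_box * cost_box                    # (N, M)
    # Hungarian algorithm: one-to-one assignment, so no NMS is needed.
    return linear_sum_assignment(cost)

# Toy example: 5 predicted "object queries", 2 ground-truth objects, 3 classes.
rng = np.random.default_rng(0)
pred_probs = rng.dirichlet(np.ones(3), size=5)                    # (5, 3)
pred_boxes = rng.random((5, 4))
pred_idx, gt_idx = hungarian_match(pred_probs, pred_boxes,
                                   np.array([0, 2]), rng.random((2, 4)))
print(pred_idx, gt_idx)
```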

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (arxiv.org)
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to rep...
Abstract: Although the Transformer has become the de facto standard in NLP, in computer vision ...
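As a concrete illustration of the "16x16 words" idea, below is a minimal sketch (an assumption on my part, not the paper's code) that cuts a 224x224 image into 16x16 patches and linearly projects each flattened patch into a token embedding a standard Transformer encoder could consume:

```python
import numpy as np

def patchify(image, patch=16):
    """image: (H, W, C) -> (num_patches, patch*patch*C) flattened patches."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0
    x = image.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)            # (H/p, W/p, p, p, C)
    return x.reshape(-1, patch * patch * C)   # one row per patch ("word")

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))
tokens = patchify(img)                         # (196, 768): 14*14 patches
# The projection is learned in practice; here a random matrix stands in for it.
W_embed = rng.normal(scale=0.02, size=(768, 768))
embeddings = tokens @ W_embed                  # (196, 768) Transformer input
print(tokens.shape, embeddings.shape)
```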

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs (arxiv.org)
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could b...
This paper appeared in March 2022 and was presented at CVPR 2022.
Abstract: This paper, inspired by ViT (Vision Transfor...
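As a rough sketch of the large-kernel design (illustrative only, not the RepLKNet implementation; the class and argument names are made up), the module below runs a 31x31 depthwise convolution alongside a small parallel kernel, the kind of parallel branch the paper discusses merging into a single kernel for inference:

```python
import torch
import torch.nn as nn

class LargeKernelDWBlock(nn.Module):
    """Depthwise 31x31 conv with a small parallel 5x5 branch (sketch only)."""
    def __init__(self, channels, large_k=31, small_k=5):
        super().__init__()
        # groups=channels makes both convs depthwise, keeping FLOPs modest.
        self.large = nn.Conv2d(channels, channels, large_k,
                               padding=large_k // 2, groups=channels)
        self.small = nn.Conv2d(channels, channels, small_k,
                               padding=small_k // 2, groups=channels)

    def forward(self, x):
        # Branches are summed; parallel kernels like these can be folded
        # into one equivalent large kernel after training.
        return self.large(x) + self.small(x)

x = torch.randn(1, 64, 56, 56)
print(LargeKernelDWBlock(64)(x).shape)   # torch.Size([1, 64, 56, 56])
```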

MobileNetV2: Inverted Residuals and Linear Bottlenecks (arxiv.org)
In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of app...
Abstract: Improves on the performance of MobileNet V1. Introduces SSDLite, an efficient way to apply the model to object detection. Semantic seg...
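For reference, here is a minimal sketch (illustrative, not the official implementation) of the inverted residual block with a linear bottleneck: a 1x1 expansion, a 3x3 depthwise convolution, and a 1x1 projection back to the thin bottleneck with no non-linearity on the output:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand (1x1) -> depthwise 3x3 -> linear 1x1 projection, stride 1."""
    def __init__(self, channels, expand=6):
        super().__init__()
        hidden = channels * expand
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),        # expansion
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1,
                      groups=hidden, bias=False),               # depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),         # projection
            nn.BatchNorm2d(channels),                            # linear: no ReLU
        )

    def forward(self, x):
        # The residual connection links the thin bottlenecks, not the wide layers.
        return x + self.block(x)

x = torch.randn(1, 32, 56, 56)
print(InvertedResidual(32)(x).shape)     # torch.Size([1, 32, 56, 56])
```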