# My Vision, Computer Vision

## Paper (51)
**REDQT: a method for automated mobile application GUI testing based on deep reinforcement learning algorithms** (www.springerprofessional.de)
> As mobile applications become increasingly prevalent in daily life, the demand for their functionality and reliability continues to grow. Traditional mobile application testing methods, particularly graphical user interface (GUI) testing, face …

Abstract: This paper … deep rein…
**End-to-End Object Detection with Transformers** (arxiv.org)
> We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor gene…

Abstract: DETR views object detection as a direct set prediction problem. In addition, NMS and anchor genera…
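The "direct set prediction" view in this abstract rests on a one-to-one bipartite matching between predicted and ground-truth objects, which is what makes duplicate-suppression steps like NMS unnecessary. A minimal sketch of that matching step, using a toy hand-written cost matrix (DETR itself combines class and box costs, and uses the Hungarian algorithm rather than brute force):

```python
from itertools import permutations

def min_cost_matching(cost):
    """Minimal-cost one-to-one matching by brute force (fine for toy sizes;
    DETR uses the Hungarian algorithm for the same result)."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_perm, best_cost = list(perm), c
    return best_perm, best_cost

# Hypothetical cost matrix: rows = predicted queries, cols = ground-truth
# objects; entry [i][j] is the cost of matching prediction i to object j.
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.4],
    [0.6, 0.7, 0.1],
]

assignment, total = min_cost_matching(cost)
print(assignment, total)  # [1, 0, 2] with total cost 0.4
```

Because each ground-truth object is matched to exactly one prediction, all other predictions are trained toward "no object", removing the need for post-hoc duplicate removal.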
**An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale** (arxiv.org)
> While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to rep…

Abstract: Although the Transformer has become the de facto standard in NLP, in computer vision …
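The "16x16 words" in the title refers to splitting an image into non-overlapping 16x16 patches and flattening each into a token vector before feeding them to a Transformer. A sketch of that tokenization step in NumPy, assuming the common 224x224 RGB input size (the linear projection and position embeddings that follow are omitted):

```python
import numpy as np

# Toy 224x224 RGB image (224 and 3 channels are assumptions; the paper's
# base configuration uses this resolution).
img = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)

P = 16                       # patch size from the title
H, W, C = img.shape
nh, nw = H // P, W // P      # 14 x 14 grid -> 196 patches

# Cut the image into non-overlapping PxP patches, then flatten each patch
# into one vector: this turns the image into a sequence of 196 "words".
patches = (img.reshape(nh, P, nw, P, C)
              .transpose(0, 2, 1, 3, 4)
              .reshape(nh * nw, P * P * C))

print(patches.shape)  # (196, 768): 196 tokens, each of dimension 16*16*3
```

Each 768-dimensional token is then linearly projected and processed exactly like a word embedding in NLP.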
**Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs** (arxiv.org)
> We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could b…

This paper was presented at CVPR in March 2022. Abstract: This paper … ViT (Vision Transfor…
**MobileNetV2: Inverted Residuals and Linear Bottlenecks** (arxiv.org)
> In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of app…

Abstract: Improves the performance of MobileNet V1; introduces SSDLite, an efficient way to apply it to object detection; semantic seg…
**MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications** (arxiv.org)
> We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks. We introduce tw…

Abstract: Background of MobileNet's creation: mobile and embedded vision application…
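The depth-wise separable convolutions mentioned in this abstract factor one dense KxK convolution into a per-channel KxK convolution plus a 1x1 pointwise convolution, which is where MobileNet's parameter savings come from. A quick parameter-count comparison for one layer (the 3x3 kernel with 32 input and 64 output channels is an illustrative choice, not a figure from the paper):

```python
# Kernel size, input channels, output channels (illustrative values).
Dk, M, N = 3, 32, 64

standard = Dk * Dk * M * N     # dense 3x3 conv: every filter sees all channels
depthwise = Dk * Dk * M        # one 3x3 filter per input channel
pointwise = M * N              # 1x1 conv that mixes channels
separable = depthwise + pointwise

print(standard, separable)         # 18432 vs 2336 parameters
print(separable / standard)        # equals 1/N + 1/Dk**2, about 0.127
```

The reduction ratio 1/N + 1/Dk² is the standard cost analysis for this factorization: for 3x3 kernels the separable form needs roughly 8-9x fewer parameters (and multiply-adds) than the dense convolution.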