List: Paper (42)
My Vision, Computer Vision

Learning Transferable Visual Models From Natural Language Supervision
Link preview (arxiv.org): State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual co..
Abstract: For existing state-of-the-art computer vision models, a predefined, fixed set of object categories, Train..

Noise-contrastive estimation: A new estimation principle for unnormalized statistical models
Link preview (proceedings.mlr.press): We present a new estimation principle for parameterized statistical models. The idea is to perform nonlinear logistic regression to discriminate between the observed data and some artificially gene...
This paper gives a mathematical account of contrastive learning. Also, in Vision Language Models, mainly..

VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning
Link preview (arxiv.org): Recent advances in Large Vision-Language Models (LVLMs) have significantly improve performance in image comprehension tasks, such as formatted charts and rich-content images. Yet, Graphical User Interface (GUI) pose a greater challenge due to their structu..
Abstract: Existing VLMs ignore visual input and rely excessively on text..

REDQT: a method for automated mobile application GUI testing based on deep reinforcement learning algorithms
Link preview (www.springerprofessional.de): As mobile applications become increasingly prevalent in daily life, the demand for their functionality and reliability continues to grow. Traditional mobile application testing methods, particularly graphical user interface (GUI) testing, face …
Abstract: In this paper, deep..

End-to-End Object Detection with Transformers
Link preview (arxiv.org): We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor gene..
Abstract: DETR views object detection as a direct set prediction problem. Also, NMS, Anchor genera..

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Link preview (arxiv.org): While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to rep..
Abstract: Although the Transformer has become the de facto standard in NLP, in computer vision ..