Posts in the 'All Categories' category (Page 12)

All categories (86)

My Vision, Computer Vision

[Paper Review/Summary] Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs. We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could b.. (arxiv.org) This paper was posted to arXiv in March 2022 and published at CVPR 2022. Abstract: In this paper, ViT (Vision Transfor..

Paper 2024. 4. 26. 20:04

[Paper Review/Summary] MobileNetV2: Inverted Residuals and Linear Bottlenecks

MobileNetV2: Inverted Residuals and Linear Bottlenecks. In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of app.. (arxiv.org) Abstract: Improves on the performance of MobileNet V1. Introduces SSDLite, an efficient way to apply the model to object detection. Semantic seg..

Paper 2024. 4. 25. 14:06

[Paper Review/Summary] MobileNetV1, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks. We introduce tw.. (arxiv.org) Abstract: Background behind MobileNet: mobile and embedded vision applications..

Paper 2024. 3. 29. 20:12
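
The MobileNets entry above mentions depth-wise separable convolutions as the source of the model's efficiency. As a rough illustration (not code from the post), the sketch below compares multiplication counts using the standard definition: a depth-wise k x k convolution per input channel followed by a 1 x 1 point-wise convolution. The function names and the example layer sizes are hypothetical, chosen only for this illustration.

```python
def standard_conv_mults(k, m, n, f):
    """Multiplications for a standard k x k convolution.

    k: kernel size, m: input channels, n: output channels,
    f: spatial size of the (square) output feature map.
    """
    return k * k * m * n * f * f


def depthwise_separable_mults(k, m, n, f):
    """Multiplications for a depth-wise k x k conv plus a 1 x 1 point-wise conv."""
    depthwise = k * k * m * f * f   # one k x k filter per input channel
    pointwise = m * n * f * f       # 1 x 1 conv mixing m channels into n
    return depthwise + pointwise


def cost_ratio(k, m, n, f):
    """Separable cost divided by standard cost; equals 1/n + 1/k**2."""
    return depthwise_separable_mults(k, m, n, f) / standard_conv_mults(k, m, n, f)


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 64 -> 128 channels, 56x56 feature map.
    print(f"cost ratio: {cost_ratio(3, 64, 128, 56):.4f}")
```

For a 3 x 3 kernel the ratio is dominated by the 1/k**2 term, so the separable form needs roughly one eighth to one ninth of the multiplications of the standard convolution.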

Extracting the Text You Want from an Image: Image Captioning with BLIP-2 (Runs on Colab)

LAVIS/projects/blip2 at main · salesforce/LAVIS. LAVIS - A One-stop Library for Language-Vision Intelligence - salesforce/LAVIS (github.com) This post was written with reference to the GitHub repository above. In this post, I'll show how to use the BLIP-2 image captioning (Image2Text) model. BLIP-2 outputs text for an input image, and the user can specify the form of answer they want. It runs in a Google Colab T4 (15GB) environment and uses about 12GB of GPU memory. 1. Install: Install the BLIP-2 package. BLIP-2 is salesforc..

WorkPlace 2024. 3. 28. 15:40

Measuring Image-Text Similarity with OpenAI CLIP (Runs on Colab)

GitHub - openai/CLIP: CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image (github.com) This post was written with reference to the GitHub repository above. In this post, I'll show how to use OpenAI's CLIP model. CLIP is a model trained jointly on images and text. Given an image and text, it outputs the similarity between them. Google..

WorkPlace 2024. 3. 27. 20:03
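
The CLIP entry above describes a model that takes an image and text and returns their similarity. The sketch below shows only the comparison step, assuming the image and each text have already been encoded into vectors; it does not call CLIP itself. The toy vectors, labels, and function names are hypothetical and stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_texts(image_vec, text_vecs):
    """Return (label, score) pairs sorted from most to least similar."""
    scores = {label: cosine_similarity(image_vec, vec)
              for label, vec in text_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up 2-D vectors; real embeddings have hundreds of dimensions.
    image_embedding = [1.0, 0.0]
    text_embeddings = {"a photo of a cat": [0.9, 0.1],
                       "a photo of a dog": [0.1, 0.9]}
    for label, score in rank_texts(image_embedding, text_embeddings):
        print(f"{label}: {score:.3f}")
```

The highest-scoring text is the one the model would consider the best match for the image.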

[Error Collection] youtube-dl, pytube, youtube.py (NotFoundError, like_count, OSError)

1. youtube-dl package error

If you see an error such as

ModuleNotFoundError: No module named 'youtube_dl'

or

ImportError: pafy: youtube-dl not found; you can use the internal backend by setting the environmental variable PAFY_BACKEND to "internal". It is not enabled by default because it is not as well maintained as the youtube-dl backend.

run the following command in a terminal:

pip install youtube-dl

Done!

2. KeyError: 'like_count' youtube-d..

Environment Setup 2024. 3. 25. 17:04
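
The first error in the entry above comes from the `youtube_dl` module not being importable. As a small sketch (not from the post), the standard library can check this before the import is attempted; `is_installed` is a hypothetical helper name used only here.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("youtube_dl"):
        print("youtube_dl is available")
    else:
        # Same fix the entry suggests for the ModuleNotFoundError.
        print("youtube_dl is missing; run: pip install youtube-dl")
```

Note that the package is installed as `youtube-dl` but imported as `youtube_dl`, which is why the error message and the pip command use different spellings.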
