Multimodal
2024
[Paper Notes] CDFSL-V: Cross-Domain Few-Shot Learning for Videos
[Paper Notes] CoSign: Exploring Co-occurrence Signals in Skeleton-based Continuous Sign Language Recognition
[Paper Notes] Visual Alignment Pre-training for Sign Language Translation
[Paper Notes] CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
[Paper Notes] Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
[Paper Notes] Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
[Paper Notes] CLIP-guided Prototype Modulating for Few-shot Action Recognition
[Paper Notes] How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
[Paper Notes] Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
[Paper Notes] VisionZip: Longer is Better but Not Necessary in Vision Language Models