Prelude
Amusi syncs every paper he digests day to day to daily-paper-computer-vision. The name is a bit on the nose; please bear with it. If you like it, you are welcome to star, fork, and send pull requests.
You can also click "Read the original article" to go straight to daily-paper-computer-vision.
link: https://github.com/amusi/daily-paper-computer-vision
ECCV 2018 is a top-tier conference in computer vision, and some of the accepted papers have already been made public. CVer has already pushed out eight ECCV 2018 paper digest posts:
[Computer Vision Paper Digest] ECCV 2018 Special 1
[Computer Vision Paper Digest] ECCV 2018 Special 2
[Computer Vision Paper Digest] ECCV 2018 Special 3
[Computer Vision Paper Digest] ECCV 2018 Special 4
[Computer Vision Paper Digest] ECCV 2018 Special 5
[Computer Vision Paper Digest] ECCV 2018 Special 6
[Computer Vision Paper Digest] ECCV 2018 Special 7
[Computer Vision Paper Digest] ECCV 2018 Special 8
Semantic Segmentation
"ConceptMask: Large-Scale Segmentation from Semantic Concepts"
ECCV 2018
Figure: Overall architecture of the proposed framework
Figure: Three stages of the training framework
Abstract: Existing works on semantic segmentation typically consider a small number of labels, ranging from tens to a few hundreds. With a large number of labels, training and evaluation of such task become extremely challenging due to correlation between labels and lack of datasets with complete annotations. We formulate semantic segmentation as a problem of image segmentation given a semantic concept, and propose a novel system which can potentially handle an unlimited number of concepts, including objects, parts, stuff, and attributes. We achieve this using a weakly and semi-supervised framework leveraging multiple datasets with different levels of supervision. We first train a deep neural network on a 6M stock image dataset with only image-level labels to learn visual-semantic embedding on 18K concepts. Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts. Finally, we train an attention-driven class agnostic segmentation network using an 80-category fully annotated dataset. We perform extensive experiments to validate that the proposed system performs competitively to the state of the art on fully supervised concepts, and is capable of producing accurate segmentations for weakly learned and unseen concepts.
arXiv: https://arxiv.org/abs/1808.06032
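To make the three-stage pipeline more concrete, here is a minimal PyTorch sketch of the core idea: score image features against a concept embedding to get a per-concept attention map, then feed that map to a class-agnostic segmentation head. All module names, layer sizes, and the toy backbone below are my own illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the attention-driven, class-agnostic segmentation idea.
# Everything here (names, sizes, the toy backbone) is an illustrative
# assumption, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptAttention(nn.Module):
    """Stages 1-2 idea: embed image features and concepts into a shared
    space, then score every spatial location against a concept vector to
    obtain a (rough) attention map for that concept."""
    def __init__(self, feat_dim=64, embed_dim=32, num_concepts=18000):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in for a deep CNN
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, embed_dim, 1))  # project to embedding dim
        self.concepts = nn.Embedding(num_concepts, embed_dim)

    def forward(self, image, concept_id):
        feats = self.backbone(image)                  # B x D x H x W
        c = self.concepts(concept_id)                 # B x D
        # cosine similarity between each location and the concept vector
        feats = F.normalize(feats, dim=1)
        c = F.normalize(c, dim=1)[..., None, None]    # B x D x 1 x 1
        return (feats * c).sum(dim=1, keepdim=True)   # B x 1 x H x W

class ClassAgnosticSegNet(nn.Module):
    """Stage 3 idea: a segmentation head that sees the image plus the
    concept attention map, so it never needs to know the class itself."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, 1, 1))                # binary mask logits

    def forward(self, image, attention):
        return self.net(torch.cat([image, attention], dim=1))

# toy forward pass
image = torch.randn(2, 3, 64, 64)
concept_id = torch.tensor([42, 1337])                 # arbitrary concept ids
attn = ConceptAttention()(image, concept_id)
mask_logits = ClassAgnosticSegNet()(image, attn)
print(mask_logits.shape)                              # torch.Size([2, 1, 64, 64])
```

The appeal of this design, as the abstract describes it, is that only the attention branch ever sees concept identity, so the segmentation head can be trained on the 80 fully annotated categories and still be applied to weakly learned or unseen concepts.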
Monocular Depth Estimation
"Learning Monocular Depth by Distilling Cross-domain Stereo Networks"
ECCV 2018
Figure: Train monocular depth network by distilling stereo network
Figure: Visualization of depth maps of different methods on KITTI test set
Abstract: Monocular depth estimation aims at estimating a pixelwise depth map for a single image, which has wide applications in scene understanding and autonomous driving. Existing supervised and unsupervised methods face great challenges. Supervised methods require large amounts of depth measurement data, which are generally difficult to obtain, while unsupervised methods are usually limited in estimation accuracy. Synthetic data generated by graphics engines provide a possible solution for collecting large amounts of depth data. However, the large domain gaps between synthetic and realistic data make directly training with them challenging. In this paper, we propose to use the stereo matching network as a proxy to learn depth from synthetic data and use predicted stereo disparity maps for supervising the monocular depth estimation network. Cross-domain synthetic data could be fully utilized in this novel framework. Different strategies are proposed to ensure learned depth perception capability well transferred across different domains. Our extensive experiments show state-of-the-art results of monocular depth estimation on KITTI dataset.
arXiv: https://arxiv.org/abs/1808.06586
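The distillation step the abstract describes can be sketched in a few lines: a stereo network (the proxy teacher, kept frozen here) predicts a disparity map from a rectified image pair, and the monocular student is trained to match that prediction from the left image alone. The tiny networks and the plain L1 loss below are illustrative assumptions; the paper's actual architectures and cross-domain training strategies are more involved.

```python
# Minimal sketch of the stereo-to-mono distillation idea: the frozen
# stereo teacher's predicted disparity supervises the monocular student.
# Network definitions and the loss choice are illustrative assumptions.
import torch
import torch.nn as nn

def tiny_net(in_ch):
    # stand-in for a real stereo/monocular architecture
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, 1))

stereo_teacher = tiny_net(6)    # takes a concatenated left/right pair
mono_student = tiny_net(3)      # takes the left image only
optimizer = torch.optim.Adam(mono_student.parameters(), lr=1e-4)

left = torch.randn(2, 3, 64, 64)
right = torch.randn(2, 3, 64, 64)

with torch.no_grad():           # teacher stays fixed during distillation
    teacher_disp = stereo_teacher(torch.cat([left, right], dim=1))

student_disp = mono_student(left)
loss = (student_disp - teacher_disp).abs().mean()   # L1 distillation loss
loss.backward()
optimizer.step()
print(loss.item())
```

In the paper's setting the teacher would first be trained on synthetic stereo data; since distillation then only needs rectified image pairs, the student never requires ground-truth depth measurements.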
Hopefully these two recent ECCV 2018 papers can spark a bit of inspiration for you~
You are welcome to like and share CVer's posts.