Online Learning to Hash for Nearest Neighbor Search

Affiliation:

(Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, Zhejiang, China)

Corresponding author:

E-mail: huweiweis@foxmail.com

CLC number:

TP18

Fund project:

National Natural Science Foundation of China (61472194, 61572266); Zhejiang Provincial Natural Science Foundation of China (LZ20F020001, LY20F020009, LY16F020003); Ningbo Leading and Top-Notch Talent Training Project, Selected Research Program (NBLJ201801003).

    Abstract:

    Hash-based nearest neighbor search is widely used in information retrieval tasks such as image retrieval, text matching and data mining. These methods compress the original data into low-dimensional binary codes with hash functions, then rank and retrieve candidates by Hamming distance, which makes them fast, efficient and insensitive to dimensionality when searching large-scale data. However, online hash learning for real-time streaming data has received little attention, and existing work rarely discusses how often the hash functions are updated or how stable they remain. To address this problem, a confidence interval is introduced to reduce how often the hash functions are replaced, and an objective function for online learning is constructed so that the algorithm stays as stable as possible and converges quickly. To verify its efficiency and effectiveness, the proposed algorithm is compared with the related online hashing algorithms OSH and OKH on public large-scale datasets; the results show that it has some advantage in average accuracy and training time.
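The retrieval step described in the abstract (compress data into binary codes, then rank by Hamming distance) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the paper's algorithm: the projection matrix `W` is drawn at random as a stand-in for learned hash functions, and the helper names `hash_codes` and `hamming_rank`, the 16-bit code length and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def hash_codes(X, W):
    """Map each row of X to a binary code: bit k is 1 when the
    projection onto column k of W is positive."""
    return (X @ W > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query,
    together with the distance of every database item."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(dists, kind="stable")
    return order, dists

rng = np.random.default_rng(0)
dim, n_bits = 64, 16

# Assumption: random projections stand in for learned hash functions.
W = rng.standard_normal((dim, n_bits))

# Synthetic database; the query is a slightly perturbed copy of item 42.
database = rng.standard_normal((1000, dim))
query = database[42] + 0.01 * rng.standard_normal(dim)

db_codes = hash_codes(database, W)
q_code = hash_codes(query[None, :], W)[0]
order, dists = hamming_rank(q_code, db_codes)

print("closest items:", order[:5])
print("their Hamming distances:", dists[order[:5]])
```

Because comparison happens on short binary codes rather than on the original vectors, the ranking cost depends on the code length and not on the input dimension, which is the dimension insensitivity the abstract refers to. An online method would additionally decide when to replace `W` as new data arrives; that update rule is not shown here.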

Cite this article

QIAN Jiangbo, HU Wei, CHEN Huahui, et al. Online learning to Hash for nearest neighbor search[J]. Control and Decision, 2019, 34(12): 2567-2575

History
  • Published online: 2019-12-04