Koala++'s blog

Lucene Source Code Analysis [-1]

2009-07-07 11:09:47 | Category: Lucene

Once again, let us start with an example that does nothing meaningful:

package forfun;

import org.apache.lucene.search.IndexSearcher;

public class SearcherTest
{
    public static void main( String[] args ) throws Exception
    {
        IndexSearcher searcher = new IndexSearcher( "E:\\a" );
        searcher.close();
    }
}

Take a look at the constructor:

/** Creates a searcher searching the index in the named directory. */
public IndexSearcher(String path) throws IOException {
    this(IndexReader.open(path), true);
}

As you can see, it still has to go through IndexReader first, and we have already looked at parts of that class:

public static IndexReader open(String path) throws IOException {
    return open(FSDirectory.getDirectory(path, false), true);
}

Once we reach FSDirectory, we know we are close to the bottom of the call chain:

public static FSDirectory getDirectory(String path, boolean create)
        throws IOException {
    return getDirectory(new File(path), create);
}

public static FSDirectory getDirectory(File file, boolean create)
        throws IOException {
    file = new File(file.getCanonicalPath());
    FSDirectory dir;
    synchronized (DIRECTORIES) {
        dir = (FSDirectory) DIRECTORIES.get(file);
        if (dir == null) {
            try {
                dir = (FSDirectory) IMPL.newInstance();
            } catch (Exception e) {
                throw new RuntimeException(
                        "cannot load FSDirectory class: " + e.toString());
            }
            dir.init(file, create);
            DIRECTORIES.put(file, dir);
        } else if (create) {
            dir.create();
        }
    }
    synchronized (dir) {
        dir.refCount++;
    }
    return dir;
}
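To make the caching in getDirectory easier to see, here is a minimal sketch of the same idea: one shared instance per canonical path, with a reference count. The class name CachedDir and its methods are hypothetical, not Lucene code. The release side is not shown in this post, so the close method below is only an assumption about how such a count would be used.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FSDirectory: one shared instance per canonical path.
class CachedDir {
    // Cache keyed by canonical file, playing the role of DIRECTORIES.
    private static final Map<File, CachedDir> CACHE = new HashMap<>();

    final File file;
    private int refCount; // number of callers currently holding this instance

    private CachedDir(File file) {
        this.file = file;
    }

    // Returns the shared instance for a path, creating it on first use.
    static CachedDir get(String path) {
        File key;
        try {
            // Canonicalize so that "index" and "./index" map to the same key.
            key = new File(path).getCanonicalFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        synchronized (CACHE) {
            CachedDir dir = CACHE.get(key);
            if (dir == null) {
                dir = new CachedDir(key);
                CACHE.put(key, dir);
            }
            dir.refCount++;
            return dir;
        }
    }

    // Assumption: dropping the last reference removes the entry from the cache.
    void close() {
        synchronized (CACHE) {
            if (refCount > 0 && --refCount == 0) {
                CACHE.remove(file);
            }
        }
    }

    int refCount() {
        synchronized (CACHE) {
            return refCount;
        }
    }

    public static void main(String[] args) {
        CachedDir a = CachedDir.get("index");
        CachedDir b = CachedDir.get("./index");
        System.out.println("same instance: " + (a == b));
        System.out.println("refCount: " + a.refCount());
        a.close();
        b.close();
    }
}
```

The point of the canonical path is that two different spellings of the same directory share one object, so synchronizing on that object protects the directory within the process.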

We already saw getDirectory once, in the first indexing example, so I will not go over it again here. Next we look at IndexReader's open function:

private static IndexReader open(final Directory directory,
        final boolean closeDirectory) throws IOException {
    synchronized (directory) { // in- & inter-process sync
        return (IndexReader) new Lock.With(directory
                .makeLock(IndexWriter.COMMIT_LOCK_NAME),
                IndexWriter.COMMIT_LOCK_TIMEOUT) {
            public Object doBody() throws IOException {
                SegmentInfos infos = new SegmentInfos();
                infos.read(directory);
                if (infos.size() == 1) { // index is optimized
                    return SegmentReader.get(infos, infos.info(0),
                            closeDirectory);
                }
                IndexReader[] readers = new IndexReader[infos.size()];
                for (int i = 0; i < infos.size(); i++)
                    readers[i] = SegmentReader.get(infos.info(i));
                return new MultiReader(directory, infos, closeDirectory,
                        readers);
            }
        }.run();
    }
}
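The new Lock.With(...) { doBody() }.run() construct above is a template: obtain the lock within a timeout, run the body, and always release the lock. The sketch below shows that shape with a hypothetical WithLock class. It uses an in-process ReentrantLock as a stand-in, whereas the lock made by directory.makeLock in the code above is also meant to work between processes, so treat this as an illustration of the pattern only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical analogue of Lock.With: run a body while holding a lock.
abstract class WithLock<T> {
    private final ReentrantLock lock;
    private final long timeoutMillis;

    WithLock(ReentrantLock lock, long timeoutMillis) {
        this.lock = lock;
        this.timeoutMillis = timeoutMillis;
    }

    // The work to perform while the lock is held.
    protected abstract T doBody();

    // Waits up to the timeout for the lock, runs doBody, and always releases.
    final T run() {
        boolean locked;
        try {
            locked = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lock", e);
        }
        if (!locked) {
            throw new IllegalStateException("lock obtain timed out");
        }
        try {
            return doBody();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock commitLock = new ReentrantLock();
        String result = new WithLock<String>(commitLock, 1000) {
            @Override
            protected String doBody() {
                return "held inside body: " + commitLock.isHeldByCurrentThread();
            }
        }.run();
        System.out.println(result);
        System.out.println("locked after run: " + commitLock.isLocked());
    }
}
```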

Inside doBody, look at the read function of SegmentInfos. We saw its write function earlier:

public final void read(Directory directory) throws IOException {
    IndexInput input = directory.openInput(IndexFileNames.SEGMENTS);
    try {
        int format = input.readInt();
        if (format < 0) { // file contains explicit format info
            // check that it is a format we can understand
            if (format < FORMAT)
                throw new IOException("Unknown format version: " + format);
            version = input.readLong(); // read version
            counter = input.readInt(); // read counter
        } else { // file is in old format without explicit format info
            counter = format;
        }

        for (int i = input.readInt(); i > 0; i--) { // read segmentInfos
            SegmentInfo si = new SegmentInfo(input.readString(), input
                    .readInt(), directory);
            addElement(si);
        }

        if (format >= 0) {
            if (input.getFilePointer() >= input.length())
                version = System.currentTimeMillis();
            else
                version = input.readLong(); // read version
        }
    } finally {
        input.close();
    }
}

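In Lucene 1.9 the segments file is simply named segments (later versions no longer use that name and add a number to it). The first thing the write function writes is FORMAT, whose value is -1. We will not bother with the outdated format; for the normal format, read picks up the version and then the counter. Here is the write function: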

public final void write(Directory directory) throws IOException {
    IndexOutput output = directory.createOutput("segments.new");
    try {
        output.writeInt(FORMAT); // write FORMAT
        output.writeLong(++version); // every write changes the index
        output.writeInt(counter); // write counter
        output.writeInt(size()); // write infos
        for (int i = 0; i < size(); i++) {
            SegmentInfo si = info(i);
            output.writeString(si.name);
            output.writeInt(si.docCount);
        }
    } finally {
        output.close();
    }

    // install new segment info
    directory.renameFile("segments.new", IndexFileNames.SEGMENTS);
}
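To check that read and write agree on the layout (FORMAT, version, counter, segment count, then a name and a document count per segment), the following sketch round-trips the same fields through a byte array. SegmentsSketch is a hypothetical class, not Lucene code. It uses DataOutputStream.writeUTF as a stand-in for Lucene's writeString, so the string encoding is an assumption and the bytes are not compatible with a real segments file. The trailing version field of the old format is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the segments file layout described above.
class SegmentsSketch {
    static final int FORMAT = -1; // negative marker for the explicit format

    long version;
    int counter;
    final List<String> names = new ArrayList<>();
    final List<Integer> docCounts = new ArrayList<>();

    // Writes FORMAT, version, counter, count, then (name, docCount) pairs.
    byte[] write() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(FORMAT);
            out.writeLong(++version); // every write bumps the version
            out.writeInt(counter);
            out.writeInt(names.size());
            for (int i = 0; i < names.size(); i++) {
                out.writeUTF(names.get(i)); // stand-in for writeString
                out.writeInt(docCounts.get(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the layout back; a non-negative first int is the old-format counter.
    static SegmentsSketch read(byte[] data) {
        SegmentsSketch infos = new SegmentsSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int format = in.readInt();
            if (format < 0) {
                if (format < FORMAT) {
                    throw new IOException("Unknown format version: " + format);
                }
                infos.version = in.readLong();
                infos.counter = in.readInt();
            } else {
                infos.counter = format;
            }
            for (int i = in.readInt(); i > 0; i--) {
                infos.names.add(in.readUTF());
                infos.docCounts.add(in.readInt());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return infos;
    }

    public static void main(String[] args) {
        SegmentsSketch infos = new SegmentsSketch();
        infos.counter = 1;
        infos.names.add("_0");
        infos.docCounts.add(3);
        SegmentsSketch back = SegmentsSketch.read(infos.write());
        System.out.println(back.version + " " + back.counter + " "
                + back.names + " " + back.docCounts);
    }
}
```

Because FORMAT is negative and a counter never is, the reader can tell the two layouts apart from the first four bytes alone.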

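The open function uses a class we have not seen before, MultiReader. It presents several readers as one; in the code above it wraps one SegmentReader per segment when the index has more than one segment. We set it aside for now and look at SegmentReader's get function: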

public static SegmentReader get(SegmentInfos sis, SegmentInfo si,
        boolean closeDir) throws IOException {
    return get(si.dir, si, sis, closeDir, true);
}

public static SegmentReader get(Directory dir, SegmentInfo si,
        SegmentInfos sis, boolean closeDir, boolean ownDir)
        throws IOException {
    SegmentReader instance;
    try {
        instance = (SegmentReader) IMPL.newInstance();
    } catch (Exception e) {
        throw new RuntimeException("cannot load SegmentReader class: "
                + e);
    }
    instance.init(dir, sis, closeDir, ownDir);
    instance.initialize(si);
    return instance;
}
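Both FSDirectory.getDirectory and SegmentReader.get create their instance with IMPL.newInstance() and only then initialize it, which is why these classes use init functions instead of constructor arguments. The post does not show where IMPL comes from, so the sketch below assumes it is a Class object chosen at startup. The class ReaderSketch and the property name sketch.reader.impl are hypothetical.

```java
// Hypothetical reader whose concrete class can be swapped at startup.
class ReaderSketch {
    // Assumption: the implementation class is read once from a system property.
    private static final Class<? extends ReaderSketch> IMPL = loadImpl();

    String segment;

    // A no-argument constructor is required for reflective creation.
    ReaderSketch() {
    }

    private static Class<? extends ReaderSketch> loadImpl() {
        String name = System.getProperty("sketch.reader.impl",
                ReaderSketch.class.getName());
        try {
            return Class.forName(name).asSubclass(ReaderSketch.class);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException("cannot load reader class: " + e);
        }
    }

    // Creates an instance reflectively, then initializes it in a second step.
    static ReaderSketch get(String segment) {
        ReaderSketch instance;
        try {
            instance = IMPL.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("cannot create reader: " + e);
        }
        instance.initialize(segment);
        return instance;
    }

    void initialize(String segment) {
        this.segment = segment;
    }

    public static void main(String[] args) {
        ReaderSketch reader = ReaderSketch.get("_0");
        System.out.println(reader.getClass().getName()
                + " opened segment " + reader.segment);
    }
}
```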

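The init and initialize functions should look familiar, since we have seen them before. I show them once more to reinforce the impression, but I only list them here without commentary: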

void init(Directory directory, SegmentInfos segmentInfos,
        boolean closeDirectory, boolean directoryOwner) {
    this.directory = directory;
    this.segmentInfos = segmentInfos;
    this.directoryOwner = directoryOwner;
    this.closeDirectory = closeDirectory;
}

private void initialize(SegmentInfo si) throws IOException {
    segment = si.name;

    // Use compound file directory for some files, if it exists
    Directory cfsDir = directory();
    if (directory().fileExists(segment + ".cfs")) {
        cfsReader = new CompoundFileReader(directory(), segment + ".cfs");
        cfsDir = cfsReader;
    }

    // No compound file exists - use the multi-file format
    fieldInfos = new FieldInfos(cfsDir, segment + ".fnm");
    fieldsReader = new FieldsReader(cfsDir, segment, fieldInfos);

    tis = new TermInfosReader(cfsDir, segment, fieldInfos);

    // NOTE: the bitvector is stored using the regular directory, not cfs
    if (hasDeletions(si))
        deletedDocs = new BitVector(directory(), segment + ".del");

    // make sure that all index files have been read or are kept open
    // so that if an index update removes them we'll still have them
    freqStream = cfsDir.openInput(segment + ".frq");
    proxStream = cfsDir.openInput(segment + ".prx");
    openNorms(cfsDir);

    if (fieldInfos.hasVectors()) { // open term vector files only as needed
        termVectorsReaderOrig = new TermVectorsReader(cfsDir, segment,
                fieldInfos);
    }
}

Now look at the IndexSearcher constructor that the String version delegates to:

private IndexSearcher(IndexReader r, boolean closeReader) {
    reader = r;
    this.closeReader = closeReader;
}

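Nothing special here: it stores the reader and sets closeReader to true, the value passed in by the String constructor. The last statement in our example is close: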

public void close() throws IOException {
    if (closeReader)
        reader.close();
}

IndexReader's close is listed as well:

public final synchronized void close() throws IOException {
    commit();
    doClose();
    if (closeDirectory)
        directory.close();
}

Take a look at commit:

protected final synchronized void commit() throws IOException {
    if (hasChanges) {
        if (directoryOwner) {
            synchronized (directory) { // in- & inter-process sync
                new Lock.With(directory
                        .makeLock(IndexWriter.COMMIT_LOCK_NAME),
                        IndexWriter.COMMIT_LOCK_TIMEOUT) {
                    public Object doBody() throws IOException {
                        doCommit();
                        segmentInfos.write(directory);
                        return null;
                    }
                }.run();
            }
            if (writeLock != null) {
                writeLock.release(); // release write lock
                writeLock = null;
            }
        } else
            doCommit();
    }
    hasChanges = false;
}

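Because we made no changes, hasChanges is false and nothing is written. It is still worth noting what happens otherwise: the modified segmentInfos is written back to the directory. In other words, the index on disk does not change until you call close or commit. Next, the doClose function: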

protected void doClose() throws IOException {
    fieldsReader.close();
    tis.close();

    if (freqStream != null)
        freqStream.close();
    if (proxStream != null)
        proxStream.close();

    closeNorms();

    if (termVectorsReaderOrig != null)
        termVectorsReaderOrig.close();

    if (cfsReader != null)
        cfsReader.close();
}

This function simply closes every file that was opened.
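The whole close sequence (commit only if something changed, then close the files, then optionally close the directory) can be condensed into a small sketch. CloseSketch and markDeleted are hypothetical names, and the log entries merely stand in for the real file operations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader that records the order of its shutdown steps.
class CloseSketch {
    final List<String> log = new ArrayList<>();
    private boolean hasChanges;
    private final boolean closeDirectory;

    CloseSketch(boolean closeDirectory) {
        this.closeDirectory = closeDirectory;
    }

    // Any modification only marks the reader dirty; nothing is written yet.
    void markDeleted(int docNum) {
        hasChanges = true;
    }

    // Writes pending changes, if any, and clears the dirty flag.
    void commit() {
        if (hasChanges) {
            log.add("write segments");
        }
        hasChanges = false;
    }

    // Mirrors IndexReader.close: commit, close files, then the directory.
    void close() {
        commit();
        log.add("close files");
        if (closeDirectory) {
            log.add("close directory");
        }
    }

    public static void main(String[] args) {
        CloseSketch untouched = new CloseSketch(true);
        untouched.close();
        System.out.println(untouched.log);

        CloseSketch modified = new CloseSketch(true);
        modified.markDeleted(7);
        modified.close();
        System.out.println(modified.log);
    }
}
```

In the example from the beginning of this post nothing was modified, so only the last two steps take place.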

 

 

 
