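When using the Search_Lucene module of Zend Framework, suppose a document has already been added to the index and is later deleted or modified. The index has to be updated promptly to keep search results current. The naive approach is to rebuild the whole index, which is expensive and unsuitable for large applications. A typical case is forum posts: when a post is deleted or edited, the index needs to be updated right away.
The official Zend_Search_Lucene documentation says very little about deleting and updating indexed documents, so I worked out a simple approach myself. Give it a try; there may well be a better way, and if you know one, please tell me.
Here is what the official documentation shows: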
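[coolcode lang="php"]
<?php
// $removePath is expected to hold the 'path' field value of the document to remove
$hits = $index->find('path:' . $removePath);
foreach ($hits as $hit) {
    $index->delete($hit->id);
}
?>
[/coolcode]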
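What puzzled me here is $removePath; I could not figure out what it was supposed to be. Here is the approach I use instead.
First, assume every document has a unique tid field, which we use as the basis for each delete and update. In my own testing, a purely numeric value did not work as a keyword field that could be indexed and searched, so the value has to be converted to a string. I use md5 to turn tid into a unique string and then use that string to find the index entries to delete or update.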
[coolcode lang="php"]
// Creating the index (partial code):
$index = Zend_Search_Lucene::create($this->lucne_index); // class property holding the index path
$doc = new Zend_Search_Lucene_Document();
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8()); // choose the analyzer that fits your character set
$doc->addField(Zend_Search_Lucene_Field::UnStored('key', md5($tid)));
$doc->addField(Zend_Search_Lucene_Field::Text('title', $title));
$doc->addField(Zend_Search_Lucene_Field::UnStored('content', $content));
$index->addDocument($doc);
$index->commit();

// Deleting and updating (partial code):
// Delete the old entry first
$key = md5($tid);
$index = Zend_Search_Lucene::open($this->lucne_index);
$query = Zend_Search_Lucene_Search_QueryParser::parse("key:$key", 'utf-8');
$hits = $index->find($query);
foreach ($hits as $hit) {
    $index->delete($hit->id);
}
// Re-index the updated data; same code as when creating
$doc = new Zend_Search_Lucene_Document(); // start from a fresh document
$doc->addField(Zend_Search_Lucene_Field::UnStored('key', md5($tid)));
$doc->addField(Zend_Search_Lucene_Field::Text('title', $title));
$doc->addField(Zend_Search_Lucene_Field::UnStored('content', $content));
$index->addDocument($doc);
$index->commit();
[/coolcode]
The idea is simply to find the entry that needs updating, delete it, and then add the new data back to the index.
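A possible variation, offered only as a sketch: store the id in a Keyword field, which is indexed without tokenization, and look it up with termDocs() rather than the query parser, so the analyzer never touches the value. This may make the md5 step unnecessary, though I have not verified that against the numeric-field problem described above. The field name tid and the helper deleteByTid() are hypothetical names for illustration, and the code assumes Zend Framework 1 is already autoloaded.

```php
<?php
// Hypothetical helper: delete every document whose 'tid' keyword equals $tid.
// Assumes documents were indexed with
//   $doc->addField(Zend_Search_Lucene_Field::Keyword('tid', (string) $tid));
function deleteByTid($index, $tid)
{
    // Build the exact term to look up: value first, field name second
    $term = new Zend_Search_Lucene_Index_Term((string) $tid, 'tid');
    $deleted = 0;

    // termDocs() returns the internal ids of the documents containing the term
    foreach ($index->termDocs($term) as $docId) {
        // Skip ids that are already marked as deleted
        if (!$index->isDeleted($docId)) {
            $index->delete($docId);
            $deleted++;
        }
    }

    $index->commit();
    return $deleted; // number of documents removed by this call
}
```

After the delete, the updated document is added again with addDocument() exactly as in the code above.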
This is just a starting point; comments and discussion are welcome.
Hi, Michael
I have a question: how do you paginate the results of a Zend_Search_Lucene search? Could you share some ideas? 🙂
Best Regards,
Joel
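At the moment zend_search_lucene offers no convenient pagination method. It only provides Zend_Search_Lucene::getResultSetLimit() and Zend_Search_Lucene::setResultSetLimit() to cap the number of results a query returns, with no support for a starting offset, so pagination has to be done yourself after fetching the result set. That probably costs some performance; caching the result data in memory with a short expiry can reduce the overhead.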
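A minimal sketch of the do-it-yourself pagination described above, assuming the full result array from find() is already in memory. The helper name paginateHits() and the shape of its return value are made up for illustration; they are not part of Zend Framework.

```php
<?php
// Hypothetical helper: return one page of an already-fetched result array.
function paginateHits(array $hits, $page, $perPage = 10)
{
    // Clamp the inputs so that page 0 or negative values fall back to page 1
    $page = max(1, (int) $page);
    $perPage = max(1, (int) $perPage);
    $total = count($hits);

    return array(
        'total' => $total,                              // number of hits overall
        'pages' => (int) ceil($total / $perPage),       // number of pages
        'items' => array_slice($hits, ($page - 1) * $perPage, $perPage),
    );
}
```

A call would look something like paginateHits($index->find($query), 2, 20). Since find() still runs the whole query each time, the short-lived cache mentioned above would sit around the find() call, not around the slicing.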
Hi Michael,
Thanks for the quick reply 🙂
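According to the official documentation, there is indeed no cursor support. Hopefully that will come in the future.
Also, the official documentation says Zend_Search_Lucene has a 2 GB limit on index files. Is there a way around this limit in practice? Is there any best practice for large data volumes?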
Thanks again ! Best Regards,
Joel
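That limit is not a Lucene problem; it comes from the platform itself, 32-bit versus 64-bit. As for a real-world Lucene use case, I once came across this article, which I hope helps: http://www.phpriot.com/d/articles/php/search/zend-search-lucene/index.html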
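Nice…
Learned something here.
Questions:
1. Can the index directory be a remote directory?
2. Given a list of data, can Lucene list all of it directly without running a query, and if so, how?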
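1. A remote directory can be used, but performance will certainly be poor, and there will be locking problems when it is shared.
2. If you know the ids it should be possible; I would need to check the documentation and source code for the details.
My impression so far is that the use cases for this component are still limited and it is not well suited to large-scale applications; adding a caching layer should help somewhat.
Back with another question. zend_search_lucene provides a highlightMatches function for highlighting the search keywords. I don't know whether Michael has used it, but when I tried it, it highlighted the entire passage. Also, the search returns a list of results, and clicking an item opens a detail page. How can the keywords be highlighted on that detail page as well?
Does the new version of ZF support pagination yet?
When updating, do you have to rebuild the whole existing index? With a lot of data, wouldn't that be terrible?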
That reminds me, I haven't looked at the changelog on the ZF site for a long time. Going to check right now.