Coreseek®  

BuildExcerpts problem

 
leelgl
Member
#1 | Posted: 2009 06 18 15:00 | Edited by: leelgl
When I search for numbers, the matches are highlighted correctly.
But when I search for text, either nothing is highlighted or the highlight comes out garbled; the entire highlighted span is mojibake:
exact_phrase=0
n=1, res= ... to be highlighted 抄?/b>颐? and for the ...
n=2, res=another 抄?/b>颐?test 我们 to be highlighted, below limit
n=3, res=test number three 我们 , without phrase match
n=4, res=final test, 我们 not only without phrase match, but also above ...

exact_phrase=1
n=1, res= ... to be highlighted 抄?/b>颐? and for the ...
n=2, res=another 抄?/b>颐?test 我们 to be highlighted, below limit
n=3, res=test number three 我们 , without phrase match
n=4, res=final test, 我们 not only without phrase match, but also above ...
The keyword being highlighted is 抄袭.
HonestQiao
Member
#2 | Posted: 2009 06 19 00:44
What is your original source text?
leelgl
Member
#3 | Posted: 2009 06 20 12:56
<?php

//
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
//

require ( "sphinxapi.php" );

$docs = array
(
    "this is my test 我们 to be highlighted 抄袭我们, and for the sake of the testing we need to pump its length somewhat",
    "another 抄袭我们 test 我们 to be highlighted, below limit",
    "test number three 我们 , without phrase match",
    "final test, 我们 not only without phrase match, but also above limit and with swapped phrase text test as well",
);
$words = "抄袭";
//$cm=new GB2312UTF8();
//$words = $cm->GB2312TOUTF8($words);
//echo $words;
$index = "test1";
$opts = array
(
    "before_match"        => "<b>",
    "after_match"        => "</b>",
    "chunk_separator"    => " ... ",
    "limit"                => 60,
    "around"            => 3,
);

foreach ( array(0,1) as $exact )
{
    $opts["exact_phrase"] = $exact;
    print "exact_phrase=$exact\n";

    $cl = new SphinxClient ();
    $res = $cl->BuildExcerpts ( $docs, $index, $words, $opts );
    if ( !$res )
    {
        die ( "ERROR: " . $cl->GetLastError() . ".\n" );
    } else
    {
        $n = 0;
        foreach ( $res as $entry )
        {
            $n++;
            print "n=$n, res=$entry\n";
        }
        print "\n";
    }
}

//
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
//

?>
HonestQiao
Member
#4 | Posted: 2009 06 20 22:44
I ran the following code:
<?php
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
require ( "api/sphinxapi.php" );

$docs = array
( "this is my test 我们 to be highlighted 抄袭我们, and for the sake of the testing we need to pump its length somewhat",
    "another 抄袭我们 test 我们 to be highlighted, below limit",
    "test number three 我们 , without phrase match",
    "final test, 我们 not only without phrase match, but also above limit and with swapped phrase text test as well",
    );
$words = "抄袭";
// $cm=new GB2312UTF8();
// $words = $cm->GB2312TOUTF8($words);
// echo $words;
$index = "test_search";
$opts = array
( "before_match" => "<b>",
    "after_match" => "</b>",
    "chunk_separator" => " ... ",
    "limit" => 60,
    "around" => 3,
    );

foreach ( array( 0, 1 ) as $exact )
{
    $opts["exact_phrase"] = $exact;
    print "exact_phrase=$exact\n";

    $cl = new SphinxClient ();
    $cl->SetServer( '127.0.0.1', 3312 );
    $res = $cl->BuildExcerpts ( $docs, $index, $words, $opts );
    if ( !$res )
    {
        die ( "ERROR: " . $cl->GetLastError() . ".\n" );
    }
    else
    {
        $n = 0;
        foreach ( $res as $entry )
        {
            $n++;
            print iconv('UTF-8','GBK',"n=$n, res=$entry\n");
        }
        print "\n";
    }
}

//
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
//

?>

Test results:
---------- PHP5 code debugging ----------
exact_phrase=0
n=1, res=this is my test 我们 to be highlighted 抄袭我们, and for the sake ...
n=2, res=another <b>抄袭</b>我们 test 我们 to be highlighted, below limit
n=3, res=test number three 我们 , without phrase match
n=4, res=final test, 我们 not only without phrase match, but also  ...

exact_phrase=1
n=1, res=this is my test 我们 to be highlighted 抄袭我们, and for the sake ...
n=2, res=another <b>抄袭</b>我们 test 我们 to be highlighted, below limit
n=3, res=test number three 我们 , without phrase match
n=4, res=final test, 我们 not only without phrase match, but also  ...


Output complete (elapsed: 0 s) - terminated normally


Please note:
My PHP file is saved with the UTF-8 character set.

Could the character set on your side be wrong?
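
One quick way to check which side has the wrong character set is to test whether the strings in the script are well-formed UTF-8. The sketch below assumes the mbstring extension is available; `is_utf8` is a hypothetical helper name and not part of sphinxapi.php. The keyword is written as raw bytes so that the example does not depend on how this file itself is saved.

```php
<?php
// Hypothetical helper (not part of sphinxapi.php): reports whether a
// string is well-formed UTF-8. Requires the mbstring extension.
function is_utf8($text)
{
    return mb_check_encoding($text, 'UTF-8');
}

$utf8 = "\xE6\x8A\x84\xE8\xA2\xAD"; // the keyword 抄袭 encoded as UTF-8
$gbk  = "\xB3\xAD\xCF\xAE";         // the same keyword encoded as GBK

var_dump(is_utf8($utf8)); // bool(true)
var_dump(is_utf8($gbk));  // bool(false)
```

If the keyword or the documents fail this check, the PHP file was probably saved in GBK or another non-UTF-8 encoding.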
leelgl
Member
#5 | Posted: 2009 06 21 11:09
I copied your code and ran it. The result is:
exact_phrase=0
n=1, res= ... to be highlighted
n=2, res=another
n=3, res=test number three
n=4, res=final test,

exact_phrase=1
n=1, res= ... to be highlighted
n=2, res=another
n=3, res=test number three
n=4, res=final test,
The code is as follows:
<?php
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
require ( "sphinxapi.php" );

$docs = array
( "this is my test 我们 to be highlighted 抄袭我们, and for the sake of the testing we need to pump its length somewhat",
    "another 抄袭我们 test 我们 to be highlighted, below limit",
    "test number three 我们 , without phrase match",
    "final test, 我们 not only without phrase match, but also above limit and with swapped phrase text test as well",
    );
$words = "抄袭";
// $cm=new GB2312UTF8();
// $words = $cm->GB2312TOUTF8($words);
// echo $words;
$index = "test1";
$opts = array
( "before_match" => "<b>",
    "after_match" => "</b>",
    "chunk_separator" => " ... ",
    "limit" => 60,
    "around" => 3,
    );

foreach ( array( 0, 1 ) as $exact )
{
    $opts["exact_phrase"] = $exact;
    print "exact_phrase=$exact\n";

    $cl = new SphinxClient ();
    $cl->SetServer( '127.0.0.1', 3312 );
    $res = $cl->BuildExcerpts ( $docs, $index, $words, $opts );
    if ( !$res )
    {
        die ( "ERROR: " . $cl->GetLastError() . ".\n" );
    }
    else
    {
        $n = 0;
        foreach ( $res as $entry )
        {
            $n++;
            print iconv('UTF-8','GBK',"n=$n, res=$entry\n");
        }
        print "\n";
    }
}

//
// $Id: test2.php 910 2007-11-16 11:43:46Z shodan $
//

?>
leelgl
Member
#6 | Posted: 2009 06 21 11:10
Does this have anything to do with PHP4 versus PHP5? I am on PHP4.
HonestQiao
Member
#7 | Posted: 2009 06 21 11:50
Please note:
My PHP file is saved with the UTF-8 character set.

Could the character set on your side be wrong?
leelgl
Member
#8 | Posted: 2009 06 22 12:50
What if my character set is GBK? How do I solve this?
HonestQiao
Member
#9 | Posted: 2009 07 09 23:05
You can use iconv to convert the encoding.
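
A minimal sketch of that conversion, assuming the index expects UTF-8 input and the page or script works in GBK. The helper names `gbk_to_utf8` and `utf8_to_gbk` are hypothetical and not part of sphinxapi.php. The BuildExcerpts call is left as a comment because it needs a running searchd.

```php
<?php
// Hypothetical helpers, not part of sphinxapi.php.

// Convert a GBK-encoded string to UTF-8 before it is sent to searchd.
function gbk_to_utf8($text)
{
    return iconv('GBK', 'UTF-8', $text);
}

// Convert a UTF-8 excerpt returned by searchd back to GBK for a GBK page.
function utf8_to_gbk($text)
{
    return iconv('UTF-8', 'GBK', $text);
}

// Sample GBK input, written as raw bytes so the example does not depend
// on the encoding this file is saved in.
$docs_gbk  = array("another \xB3\xAD\xCF\xAE test"); // "another 抄袭 test"
$words_gbk = "\xB3\xAD\xCF\xAE";                     // "抄袭"

$docs_utf8  = array_map('gbk_to_utf8', $docs_gbk);
$words_utf8 = gbk_to_utf8($words_gbk);

// With a running searchd, the converted values would then be used like this:
// $res     = $cl->BuildExcerpts($docs_utf8, $index, $words_utf8, $opts);
// $res_gbk = array_map('utf8_to_gbk', $res);
```

Converting both the documents and the keyword matters: if only one of them is converted, the keyword bytes will not match the document bytes and nothing will be highlighted.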
wgbbiao
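Member
#10 | Posted: 2010 02 23 12:51 | Edited by: wgbbiao
Bump, I have a problem too: terms that should be highlighted are not.


Mine is solved. It was not really a problem after all; I just had not understood how word segmentation works.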
 