An Example of Developing a PHP .so Extension Module in C on Linux/FreeBSD

When quoting this article, please credit the source: Just Do IT (http://www.toplee.com) < Michael Lee @ toplee.com >

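I first got into web development in 1997, nine years ago now. I started out building HTML pages in FrontPage, then learned ASP + Access + IIS, and I have been hooked on web development ever since. After that I worked in turn with ASP + SQL Server + IIS, JSP + Oracle + JRun (Resin/Tomcat), and PHP + Sybase (MySQL) + Apache. I finally settled on PHP + MySQL + Apache + Linux (BSD), the stack commonly called LAMP. There are many reasons for that, and plenty of people online already debate the pros and cons of the various stacks and languages, so I will not repeat them. Briefly, these are the main reasons I like LAMP: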

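1. It is a completely open and free platform;
2. It is easy to pick up, and resources of every kind are abundant;
3. PHP, MySQL, and Apache integrate closely with the underlying Linux (BSD) system and with each other, which makes the stack very efficient;
4. All of them are written in C/C++, so performance is efficient and dependable;
5. PHP's style closely follows C, and it also borrows many design strengths from Java and C++;
6. Most important of all, PHP makes it very convenient to write extension modules in C/C++, which makes PHP almost unlimitedly extensible!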

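For these reasons I am very fond of PHP-based architectures, and the last point is the most important of all. I promoted this platform at both Yahoo and mop, and I have some experience extending PHP in C, which I would like to share here in the hope that it prompts others to share theirs.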

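There are several ways to write a PHP extension module in C. By final form there are two: the extension is either compiled directly into PHP, or built as a .so extension module that PHP loads. By build method there are also two: using the phpize tool (available once PHP has been built and installed) or using the ext_skel tool (shipped with the PHP source). The approach we use most, and the most convenient one, is to write a .so extension module with ext_skel, so that is what this article mainly covers.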

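In the PHP source tree there is an ext directory (everything I say about PHP here refers to PHP on Linux, not on Windows). Inside ext there is a tool called ext_skel. It generates a generic template and set of steps for developing a PHP extension module, which makes writing one simple. Below we use as our example an extension module that converts among the UTF-8, GBK, and GB2312 encodings in PHP. In this module we ultimately want to provide the following functions: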

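(1) string toplee_big52gbk(string s)
Converts the input string from Big5 to GBK
(2) string toplee_gbk2big5(string s)
Converts the input string from GBK to Big5
(3) string toplee_normalize_name(string s)
Normalizes the input string: full-width to half-width, trim, uppercase to lowercase
(4) string toplee_fan2jian(int code, string s)
Converts the input GBK Traditional Chinese string to Simplified Chinese
(5) string toplee_decode_utf(string s)
Converts a UTF-encoded string to Unicode
(6) string toplee_decode_utf_gb(string s)
Converts a UTF-encoded string to GB
(7) string toplee_decode_utf_big5(string s)
Converts a UTF-encoded string to Big5
(8) string toplee_encode_utf_gb(string s)
Converts the input GBK-encoded string to UTF encoding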

First, go into the ext directory and run the following command:
# ./ext_skel --extname=toplee
ext_skel then generates a directory named toplee under ext, containing the following files:
.cvsignore
CREDITS
EXPERIMENTAL
config.m4
php_toplee.h
tests
toplee.c
toplee.php

The most useful of these are config.m4 and toplee.c.
Next we edit config.m4:
# vi ./config.m4
Find the lines that look like this:


dnl PHP_ARG_WITH(toplee, for toplee support,
dnl Make sure that the comment is aligned:
dnl [  --with-toplee             Include toplee support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(toplee, whether to enable toplee support,
dnl Make sure that the comment is aligned:
dnl [  --enable-toplee           Enable toplee support])

These lines tell the PHP build which way our extension module toplee is to be enabled. We use the --with-toplee form, so we change them to look like this:


PHP_ARG_WITH(toplee, for toplee support,
dnl Make sure that the comment is aligned:
[  --with-toplee             Include toplee support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(toplee, whether to enable toplee support,
dnl Make sure that the comment is aligned:
dnl [  --enable-toplee           Enable toplee support])

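The key task after that is writing toplee.c, the main file of the module. Even if you change nothing in it, you already have a complete PHP extension module; the generated file contains code like the following: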

PHP_FUNCTION(confirm_toplee_compiled)
{
    char *arg = NULL;
    int arg_len, len;
    char string[256];

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &arg, &arg_len) == FAILURE) {
        return;
    }

    len = sprintf(string, "Congratulations! You have successfully modified ext/%.78s/config.m4. Module %.78s is now compiled into PHP.", "toplee", arg);
    RETURN_STRINGL(string, len, 1);
}

If the new module is compiled in when we later build PHP, a PHP script can call confirm_toplee_compiled("toplee"), which returns the string "Congratulations! You have successfully modified ext/toplee/config.m4. Module toplee is now compiled into PHP."

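Below are our changes to toplee.c so that it supports the functions and interfaces planned above. Here is the source of toplee.c (note that the listing registers the module under the internal name gbk, so its module entry, macros, and the header php_gbk.h use gbk rather than toplee):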

/*
   +----------------------------------------------------------------------+
   | PHP Version 4                                                        |
   +----------------------------------------------------------------------+
   | Copyright (c) 1997-2002 The PHP Group                                |
   +----------------------------------------------------------------------+
   | This source file is subject to version 2.02 of the PHP license,      |
   | that is bundled with this package in the file LICENSE, and is        |
   | available at through the world-wide-web at                           |
   | http://www.php.net/license/2_02.txt.                                 |
   | If you did not receive a copy of the PHP license and are unable to   |
   | obtain it through the world-wide-web, please send a note to          |
   | license@php.net so we can mail you a copy immediately.               |
   +----------------------------------------------------------------------+
   | Author:                                                              |
   +----------------------------------------------------------------------+

   $Id: header,v 1.10 2002/02/28 08:25:27 sebastian Exp $
*/

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "php.h"
#include "php_ini.h"
#include "ext/standard/info.h"
#include "php_gbk.h"
#include "toplee_util.h"

/* If you declare any globals in php_gbk.h uncomment this:
ZEND_DECLARE_MODULE_GLOBALS(gbk)
*/

/* True global resources - no need for thread safety here */
static int le_gbk;

/* {{{ gbk_functions[]
*
* Every user visible function must have an entry in gbk_functions[].
*/
function_entry gbk_functions[] = {
PHP_FE(toplee_decode_utf, NULL)
PHP_FE(toplee_decode_utf_gb, NULL)
PHP_FE(toplee_decode_utf_big5, NULL)
PHP_FE(toplee_encode_utf_gb, NULL)

PHP_FE(toplee_big52gbk, NULL)
PHP_FE(toplee_gbk2big5, NULL)
PHP_FE(toplee_fan2jian, NULL)
PHP_FE(toplee_normalize_name, NULL)
{NULL, NULL, NULL} /* Must be the last line in gbk_functions[] */
};
/* }}} */

/* {{{ gbk_module_entry
*/
zend_module_entry gbk_module_entry = {
#if ZEND_MODULE_API_NO >= 20010901
STANDARD_MODULE_HEADER,
#endif
"gbk",
gbk_functions,
PHP_MINIT(gbk),
PHP_MSHUTDOWN(gbk),
PHP_RINIT(gbk), /* Replace with NULL if there's nothing to do at request start */
PHP_RSHUTDOWN(gbk), /* Replace with NULL if there's nothing to do at request end */
PHP_MINFO(gbk),
#if ZEND_MODULE_API_NO >= 20010901
"0.1", /* Replace with version number for your extension */
#endif
STANDARD_MODULE_PROPERTIES
};
/* }}} */

#ifdef COMPILE_DL_GBK
ZEND_GET_MODULE(gbk)
#endif

/* {{{ PHP_INI
*/
/* Remove comments and fill if you need to have entries in php.ini*/
PHP_INI_BEGIN()
PHP_INI_ENTRY("gbk2uni", "", PHP_INI_SYSTEM, NULL)
PHP_INI_ENTRY("uni2gbk", "", PHP_INI_SYSTEM, NULL)
PHP_INI_ENTRY("uni2big5", "", PHP_INI_SYSTEM, NULL)
PHP_INI_ENTRY("big52uni", "", PHP_INI_SYSTEM, NULL)
PHP_INI_ENTRY("big52gbk", "", PHP_INI_SYSTEM, NULL)
PHP_INI_ENTRY("gbk2big5", "", PHP_INI_SYSTEM, NULL)
// STD_PHP_INI_ENTRY("gbk.global_value", "42", PHP_INI_ALL, OnUpdateInt, global_value, zend_gbk_globals, gbk_globals)
// STD_PHP_INI_ENTRY("gbk.global_string", "foobar", PHP_INI_ALL, OnUpdateString, global_string, zend_gbk_globals, gbk_globals)
PHP_INI_END()

/* }}} */

/* {{{ php_gbk_init_globals
*/
/* Uncomment this function if you have INI entries
static void php_gbk_init_globals(zend_gbk_globals *gbk_globals)
{
gbk_globals->global_value = 0;
gbk_globals->global_string = NULL;
}
*/
/* }}} */

char gbk2uni_file[256];
char uni2gbk_file[256];
char big52uni_file[256];
char uni2big5_file[256];
char gbk2big5_file[256];
char big52gbk_file[256];

//utf file init flag
static int initutf=0;

/* {{{ PHP_MINIT_FUNCTION
*/
PHP_MINIT_FUNCTION(gbk)
{
/* If you have INI entries, uncomment these lines
ZEND_INIT_MODULE_GLOBALS(gbk, php_gbk_init_globals, NULL);*/
REGISTER_INI_ENTRIES();
memset(gbk2uni_file, 0, sizeof(gbk2uni_file));
memset(uni2gbk_file, 0, sizeof(uni2gbk_file));
memset(big52uni_file, 0, sizeof(big52uni_file));
memset(uni2big5_file, 0, sizeof(uni2big5_file));
memset(gbk2big5_file, 0, sizeof(gbk2big5_file));
memset(big52gbk_file, 0, sizeof(big52gbk_file));

/* Copy each configured code-table path; every buffer is bounded by its own size. */
strncpy(gbk2uni_file, INI_STR("gbk2uni"), sizeof(gbk2uni_file)-1);
strncpy(uni2gbk_file, INI_STR("uni2gbk"), sizeof(uni2gbk_file)-1);
strncpy(big52uni_file, INI_STR("big52uni"), sizeof(big52uni_file)-1);
strncpy(uni2big5_file, INI_STR("uni2big5"), sizeof(uni2big5_file)-1);
strncpy(gbk2big5_file, INI_STR("gbk2big5"), sizeof(gbk2big5_file)-1);
strncpy(big52gbk_file, INI_STR("big52gbk"), sizeof(big52gbk_file)-1);

//InitMMResource();
InitResource();
if ((uni2gbk_file[0] == '\0') || (uni2big5_file[0] == '\0')
|| (gbk2big5_file[0] == '\0') || (big52gbk_file[0] == '\0')
|| (gbk2uni_file[0] == '\0') || (big52uni_file[0] == '\0'))
{
return FAILURE;
}

if (gbk2uni_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_GBK2UNI, gbk2uni_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

if (uni2gbk_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_UNI2GBK, uni2gbk_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

if (big52uni_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_BIG52UNI, big52uni_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

if (uni2big5_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_UNI2BIG5, uni2big5_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

if (gbk2big5_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_GBK2BIG5, gbk2big5_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

if (big52gbk_file[0] != '\0')
{
if (LoadOneCodeTable(CODE_BIG52GBK, big52gbk_file) != NULL)
{
toplee_cleanup_mmap(NULL);
return FAILURE;
}
}

initutf = 1;
return SUCCESS;
}
/* }}} */

/* {{{ PHP_MSHUTDOWN_FUNCTION
*/
PHP_MSHUTDOWN_FUNCTION(gbk)
{
/* uncomment this line if you have INI entries*/
UNREGISTER_INI_ENTRIES();

toplee_cleanup_mmap(NULL);
return SUCCESS;
}
/* }}} */

/* Remove if there's nothing to do at request start */
/* {{{ PHP_RINIT_FUNCTION
*/
PHP_RINIT_FUNCTION(gbk)
{
return SUCCESS;
}
/* }}} */

/* Remove if there's nothing to do at request end */
/* {{{ PHP_RSHUTDOWN_FUNCTION
*/
PHP_RSHUTDOWN_FUNCTION(gbk)
{
return SUCCESS;
}
/* }}} */

/* {{{ PHP_MINFO_FUNCTION
*/
PHP_MINFO_FUNCTION(gbk)
{
php_info_print_table_start();
php_info_print_table_header(2, "gbk support", "enabled");
php_info_print_table_end();

/* Remove comments if you have entries in php.ini*/
DISPLAY_INI_ENTRIES();

}
/* }}} */

/* Remove the following function when you have succesfully modified config.m4
so that your module can be compiled into PHP, it exists only for testing
purposes. */

/* {{{ proto toplee_decode_utf(string s)
*/
PHP_FUNCTION(toplee_decode_utf)
{
    char *s = NULL, *t = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);  /* decode a private copy, not the caller's string */
    if (t == NULL)
        RETURN_FALSE;

    DecodePureUTF((unsigned char *)t, KEEP_UNICODE);
    RETVAL_STRING(t, 1);  /* copy the result into engine-managed memory */
    free(t);
    return;
}
/* }}} */

/* {{{ proto toplee_decode_utf_gb(string s)
*/
PHP_FUNCTION(toplee_decode_utf_gb)
{
    char *s = NULL, *t = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);  /* decode a private copy, not the caller's string */
    if (t == NULL)
        RETURN_FALSE;

    DecodePureUTF((unsigned char *)t, DECODE_UNICODE);
    RETVAL_STRING(t, 1);  /* copy the result into engine-managed memory */
    free(t);
    return;
}
/* }}} */

/* {{{ proto toplee_decode_utf_big5(string s)
*/
PHP_FUNCTION(toplee_decode_utf_big5)
{
    char *s = NULL, *t = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);  /* decode a private copy, not the caller's string */
    if (t == NULL)
        RETURN_FALSE;

    DecodePureUTF((unsigned char *)t, DECODE_UNICODE | DECODE_BIG5);
    RETVAL_STRING(t, 1);  /* copy the result into engine-managed memory */
    free(t);
    return;
}
/* }}} */
int EncodePureUTF(unsigned char* strSrc,
                  unsigned char* strDst, int nDstLen, int nFlag)
{
    int nRet;
    int pos;
    unsigned short c;
    unsigned short* uBuf;
    unsigned short* p;
    int nSize;
    int nLen;
    int nReturn;

    nLen = strlen((const char*)strSrc);
    if (nDstLen < nLen*2+1)
        return 0;
    nSize = nLen+1;
    uBuf = (unsigned short*)emalloc(sizeof(unsigned short)*nSize);
    /* First convert the GBK input (code page 936) to 16-bit Unicode. */
    nRet = MultiByteToWideChar(936, 0, (char*)strSrc, nLen, uBuf, nSize);
    nReturn = 0;
    pos = nRet;
    p = uBuf;  /* walk with a second pointer so uBuf can be freed afterwards */
    while (pos > 0)
    {
        c = *p;
        if (c < 0x80) {
            strDst[nReturn++] = (unsigned char) c;
        } else if (c < 0x800) {
            strDst[nReturn++] = (0xc0 | (c >> 6));
            strDst[nReturn++] = (0x80 | (c & 0x3f));
        } else {
            /* a 16-bit value never needs more than three UTF-8 bytes */
            strDst[nReturn++] = (0xe0 | (c >> 12));
            strDst[nReturn++] = (0x80 | ((c >> 6) & 0x3f));
            strDst[nReturn++] = (0x80 | (c & 0x3f));
        }
        pos--;
        p++;
    }
    strDst[nReturn] = '\0';
    efree(uBuf);

    return nReturn;
}

/* {{{ proto toplee_encode_utf_gb(string s)
*/
PHP_FUNCTION(toplee_encode_utf_gb)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    sRet = emalloc(strlen(s)*2+1);

    EncodePureUTF((unsigned char *)s, (unsigned char *)sRet, strlen(s)*2+1, 0);
    RETVAL_STRING(sRet, 0);  /* hand the emalloc'd buffer to the engine instead of copying and leaking it */
    return;
}
/* }}} */

/* {{{ proto toplee_big52gbk(string s)
*/
PHP_FUNCTION(toplee_big52gbk)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    sRet = estrdup(s);
    if (NULL == sRet)
        RETURN_FALSE;

    BIG52GBK(sRet, strlen(sRet));
    RETVAL_STRING(sRet, 0);  /* sRet is already engine-allocated, so do not duplicate it */
    return;
}
/* }}} */

/* {{{ proto toplee_gbk2big5(string s)
*/
PHP_FUNCTION(toplee_gbk2big5)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    sRet = estrdup(s);
    if (NULL == sRet)
        RETURN_FALSE;

    GBK2BIG5(sRet, strlen(sRet));
    RETVAL_STRING(sRet, 0);  /* sRet is already engine-allocated, so do not duplicate it */
    return;
}
/* }}} */

/* {{{ proto toplee_normalize_name(string s)
*/
PHP_FUNCTION(toplee_normalize_name)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    /* Normalize a copy: NormalizeName rewrites its argument in place,
       and s points at the caller's string. */
    sRet = estrdup(s);
    NormalizeName(sRet);

    RETURN_STRING(sRet, 0);
}
/* }}} */

/* {{{ proto toplee_fan2jian(int code, string s)
*/
PHP_FUNCTION(toplee_fan2jian)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    long code;  /* the "l" specifier stores into a long, not an int */
    char *pSource;
    char *pDest1 = NULL, *pDest2 = NULL;
    int nSourceLen, nDestLen;

    if (zend_parse_parameters(argc TSRMLS_CC, "ls", &code, &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    pSource = s;
    nSourceLen = s_len;
    pDest1 = malloc(nSourceLen * 2);
    pDest2 = malloc(nSourceLen + 1);
    if (NULL == pDest1 || NULL == pDest2)
        goto _f2j_err;

    memset(pDest1, 0, nSourceLen * 2);
    memset(pDest2, 0, nSourceLen + 1);
    nDestLen = MultiByteToWideChar((unsigned int)code, 0, pSource, nSourceLen, (unsigned short *)pDest1, nSourceLen * 2);

    if (0 >= nDestLen)
        goto _f2j_err;

    nDestLen = WideCharToMultiByte((unsigned int)code, 0, (unsigned short *)pDest1, nDestLen, pDest2, nSourceLen, NULL, NULL);
    if (0 >= nDestLen)
        goto _f2j_err;

    RETVAL_STRING(pDest2, 1);
    free(pDest1);
    free(pDest2);
    return;

_f2j_err:
    if (pDest1 != NULL)
        free(pDest1);
    if (pDest2 != NULL)
        free(pDest2);
    RETURN_FALSE;
}
/* }}} */

/*
* Local variables:
* tab-width: 4
* c-basic-offset: 4
* End:
* vim600: noet sw=4 ts=4 fdm=marker
* vim<600: noet sw=4 ts=4 */
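
The UTF-8 byte layout that EncodePureUTF relies on can be tried outside PHP. What follows is a minimal standalone sketch, not the module's own code: the name ucs2_to_utf8 and its bounds-checking convention are assumptions made for illustration, and like the listing it handles only 16-bit values (no surrogate pairs).

```c
#include <assert.h>
#include <stddef.h>

/* Encode n 16-bit code units from src as UTF-8 into dst.
 * dst_cap is the capacity of dst in bytes, including the terminating NUL.
 * Returns the number of bytes written (excluding the NUL), or -1 if dst
 * is too small. Hypothetical helper; surrogate pairs are not combined. */
int ucs2_to_utf8(const unsigned short *src, size_t n,
                 unsigned char *dst, size_t dst_cap)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        unsigned int c = src[i] & 0xFFFFu;
        size_t need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        if (out + need + 1 > dst_cap)   /* keep room for the NUL */
            return -1;
        if (c < 0x80) {
            dst[out++] = (unsigned char)c;                     /* 0xxxxxxx */
        } else if (c < 0x800) {
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));     /* 110xxxxx */
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));   /* 10xxxxxx */
        } else {
            dst[out++] = (unsigned char)(0xE0 | (c >> 12));    /* 1110xxxx */
            dst[out++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    if (out + 1 > dst_cap)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

For example, the three code units 0x0041, 0x00E9, and 0x4E2D encode to the six bytes 41 C3 A9 E4 B8 AD.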


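In fact, this file declares every interface we want to implement; what remains is to write the C code that actually implements them. I will not discuss the implementation details here. The key point to note is that you can put all of the C source and header files that implement the interfaces defined in toplee.c into the ext/toplee directory, so that toplee.c can use them at build time; all of this is standard C. I will not say more about it; below are several of the files from our implementation: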
chn_util.h

#ifndef __CHN_UTIL_H__
#define __CHN_UTIL_H__

#include "common.h"

#define LANG_GB 1
#define LANG_B5 2

#define GB_FULL_COUNT (20+26*2+5+4+26)
#define B5_FULL_COUNT (20+26*2+5+4+24)

BOOL FullToHalf(char *str, int nLang);

void LowerString(char* str);

void TrimString(char* str);

#endif // __CHN_UTIL_H__


chn_util.c

#include
#include
#include
#include "common.h"
#include "chn_util.h"

// 0123456789 !@()-_+'<>
/* Each entry is meant to be the full-width (two-byte GBK) form of the
   character at the same position in GBEnHalf; FullToHalf compares two
   bytes at a time against this table. */
static char *GBFull[GB_FULL_COUNT] =
{"0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
" ", "@", "(", ")", "-", "_", "+", "'", "<", ">",
"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k",
"l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v",
"w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G",
"H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R",
"S", "T", "U", "V", "W", "X", "Y", "Z",
"。", "·", ".", "﹒", "&",
"《", "〈", "〉", "》",
"﹐", ",", "﹔", ";", "﹕", ":", "﹖", "?", "﹗", "!", "—",
"‘", "’", "“", "”", "~", "∶", "`", "|", "[", "]", "{",
"}", "#", "$", "%"
};

static char GBEnHalf[GB_FULL_COUNT+1] =
"0123456789 @()-_+\'<>abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
"....&<<>>,,;;::\?\?!!-\'\'\"\"~:`|[]{}#$%";

// ⒈⒉⒊⒋⒌⒍⒎⒏⌒∨∠ˇ≌≈
static char *B5Full[B5_FULL_COUNT] =
{“”, “”, “⒈”, “⒉”, “⒊”, “⒋”, “⒌”, “⒍”, “⒎”, “⒏”,
“”, “”, “”, “”, “⌒”, “∨”, “∠”, “ˇ”, “≌”, “≈”,
“㈤”, “㈥”, “㈦”, “㈧”, “㈨”, “㈩”, “”, “”, “Ⅰ”, “Ⅱ”, “Ⅲ”,
“Ⅳ”, “Ⅴ”, “Ⅵ”, “Ⅶ”, “Ⅷ”, “Ⅸ”, “Ⅹ”, “Ⅺ”, “Ⅻ”, “”, “”,
“”, “”, “”, “”, “⑾”, “⑿”, “⒀”, “⒁”, “⒂”, “⒃”, “⒄”,
“⒅”, “⒆”, “⒇”, “①”, “②”, “③”, “④”, “⑤”, “⑥”, “⑦”, “⑧”,
“⑨”, “⑩”, “”, “”, “㈠”, “㈡”, “㈢”, “㈣”,
“”, “”, “”, “”, “‘”,
“”, “”, “”, “”,
“”, “”, “”, “”, “”, “”, “”, “”, “”, “”, “”,
“ˉ”, “ˇ”, “¨”, “〃”, “°”, “”, “”, “”, “”, “”, “…”,
“”, “”
};

static char B5EnHalf[B5_FULL_COUNT+1] =
"0123456789 @()-_+\'<>abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
"....&<<>>,,;;::\?\?!!-\'\'\"\"~|[]{}#$%";

static int _bFHSortFlag=0;

static void _sorttable(char* tableFull[], char* tableHalf, int nSize)
{
int i,j;
char* p;
char cTemp;

for(i=0;i 0)
nHigh=nMid-1;
else
nLow=nMid+1;
}

if( !bContinue )
{
// 判断其他符号
if( ( 0xA1 <= (BYTE)*pSrc ) && ( 0xA9 >= (BYTE)*pSrc ) )
{
*pDest++ = ' ';
pSrc+=2;
bContinue=TRUE;
}
}

/*
        for (nIndex = 0; nIndex < nCount; nIndex++)
        {
            assert(NULL != pFull[nIndex]);
            if (NULL != pFull[nIndex])
            {
                if (0 == strncmp(pSrc, pFull[nIndex], 2))
                {
                    *pDest++ = pEnHalf[nIndex]; // convert full to half
                    pSrc += 2;
                    bContinue = TRUE;
                    break;
                }
            }
        }
*/
            if (bContinue)
            {
                bContinue = FALSE;
                continue;
            }
            *pDest++ = *pSrc++; // copy head char, and the next statement copy tail char
            if(*pSrc == '\0')
                break;
        }
        *pDest++ = *pSrc++; // ascii code
    }
    *pDest = '\0';
    return TRUE;
}

BOOL MyIsDBCSLeadByte(BYTE TestChar)
{
    if((TestChar>0X80) && (TestChar<0xFF))
        return TRUE;
    else
        return FALSE;
}

void LowerString(char* str)
{
    while(*str)
    {
        if(!MyIsDBCSLeadByte(*str))
        {
            if( (*str>='A') && (*str<='Z') )
                *str = (char)(*str+('a'-'A'));
        }
        else
        {
            str++;
            if(!*str)
                break;
        }
        str++;
    }
    return ;
}

BOOL myisspace(char c)
{
    return ((c==' ') || (c=='\t') || (c=='\r') || (c=='\n'));
}

void TrimString(char* str)
{
    char* pDst;
    char* pSrc;
    char* pLast;
    char cCurrent;
    int nState;

    pLast=pDst=pSrc=str;
    nState=0;
    while(*pSrc)
    {
        cCurrent=*pSrc;
        switch(nState)
        {
        case 0:
            if(!myisspace(cCurrent))
            {
                nState=1;
                continue;
            }
            break;
        case 1:
            if(myisspace(cCurrent))
            {
                nState=2;
                *pDst=cCurrent;
            }
            else
            {
                *pDst=cCurrent;
                pLast=pDst+1;
            }
            pDst++;
            break;
        case 2:
            if(myisspace(cCurrent))
            {
                *pDst=cCurrent;
            }
            else
            {
                *pDst=cCurrent;
                pLast=pDst+1;
            }
            pDst++;
            break;
        }
        pSrc++;
    }
    *pLast='\0';
    return;
}
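
The two ideas behind LowerString and TrimString (skip both bytes of a double-byte character when lowercasing, and trim whitespace in place) can be exercised in a small standalone program. This is a simplified sketch under stated assumptions rather than the code above: the names lower_ascii_dbcs and trim_in_place are made up for illustration, and the trim uses strlen/memmove instead of the state machine.

```c
#include <assert.h>
#include <string.h>

/* Bytes 0x81..0xFE are treated as the first byte of a two-byte
 * GBK/Big5 character, the same range MyIsDBCSLeadByte tests. */
static int is_dbcs_lead(unsigned char b)
{
    return b > 0x80 && b < 0xFF;
}

static int is_space_byte(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Lowercase ASCII letters only; both bytes of a double-byte character
 * are skipped, so a trail byte that happens to equal 'A'..'Z' is kept. */
void lower_ascii_dbcs(char *s)
{
    assert(s != NULL);
    while (*s) {
        if (is_dbcs_lead((unsigned char)*s)) {
            s++;                /* skip the lead byte */
            if (!*s)
                break;          /* truncated character at end of string */
            s++;                /* skip the trail byte */
        } else {
            if (*s >= 'A' && *s <= 'Z')
                *s = (char)(*s + ('a' - 'A'));
            s++;
        }
    }
}

/* Remove leading and trailing whitespace in place. The four whitespace
 * bytes are all below 0x40, so they cannot be DBCS trail bytes. */
void trim_in_place(char *s)
{
    size_t start = 0;
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    while (len > 0 && is_space_byte(s[len - 1]))
        len--;
    while (start < len && is_space_byte(s[start]))
        start++;
    memmove(s, s + start, len - start);
    s[len - start] = '\0';
}
```

Applied to "  Hello World\t\n", trimming and then lowercasing leaves "hello world"; a double-byte character whose trail byte is 0x41 passes through unchanged.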


toplee_util.c

……

int ToBase64(void* pSrc,int nSrcLen, char* strBase64, int* nBase64Len)
{
static char *v = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

………. (more than 3,000 lines of code in between are omitted from this article) ……..

void NormalizeName( char *p )
{
FullToHalf( p, CODE_PAGE_GBK );
TrimString( p );
LowerString( p );
}



toplee_util.h

#ifndef __TOPLEE_UTIL_INCLUDE__
#define __TOPLEE_UTIL_INCLUDE__ 1

#include
#include
#include
#include
#include
#ifdef LINUX
#include
#endif

#include "common.h"

//#include "euc2uni.h"

/*
typedef int BOOL;
*/
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

#define ASCII 0
#define HZ_HEAD 1
#define HZ_TAIL 2

#ifdef BIG_ENDDING
#define DEFAULT_UNICODE 0x3000
#define DEFAULT_GBK_CODE 0xA1A1
#define DEFAULT_BIG5_CODE 0xA140
#else
#define DEFAULT_UNICODE 0x0030
#define DEFAULT_GBK_CODE 0xA1A1
#define DEFAULT_BIG5_CODE 0x40A1
#endif

#define CODE_PAGE_GBK 936
#define CODE_PAGE_BIG5 950
#define CODE_PAGE_EUC 932

#define CHARSET_DEFAULT 0
#define CHARSET_UNICODE 1
#define CHARSET_UTF8 2

// 24066 = ( 0xFE - 0x81 + 1 ) * ( 0xFE - 0x40 + 1)
#define GBK_COUNT 24066

// 16999 = ( 0xF9 - 0xA1 + 1 ) * ( 0xFE - 0x40 + 1)
#define BIG5_COUNT 16999

typedef struct tagMMapFile2
{
BOOL bUsed;
struct stat finfo;
void *mm;
} MMapFile;

//int LoadEuc2UniTable(char *strFileName);
//void FreeEuc2UniTable(void);

int ToBase64(void* pSrc,int nSrcLen, char* strBase64, int* nBase64Len);
int FromBase64(char* strSrc, int nSrcLen, void* pDest, int* nDestLen);
int htmlencode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);

int MultiByteToWideChar(unsigned int uCodePage, unsigned long lFlags,
char *pMultiByteStr, int nMultiByte,
unsigned short *pWideChar, int nWideChar);
int WideCharToMultiByte(unsigned int uCodePage, unsigned long dwFlags,
unsigned short *pWideCharStr, int nWideChar,
char *pMultiByteStr, int nMultiByte,
const char* lpDefaultChar, int* lpUseDefaultChar);

#define ASCII 0
#define HZ_HEAD 1
#define HZ_TAIL 2

void GBK2BIG5(char *lpString, int cbString);
void BIG52GBK(char *lpString, int cbString);

void LowerString(char *str);
void TrimString(char *str);
void DecodeFormString(char *str);
void DecodeUTF(char *str);

#define DECODE_UNICODE 0
#define KEEP_UNICODE 1

#define DECODE_GBK 0
#define DECODE_BIG5 2

int DecodePureUTF(unsigned char *str, int nFlag);

#define LANG_GB 1 // used by httpstrtoint and FullToHalf
#define LANG_B5 2
#define LANG_ENG 3
#define LANG_UNKNOWN 4

int httpstrtoint(char* strHttp);
void lowerhttpprefix(char* strUrl);

#define FULL_COUNT (21+26*2+5)

BOOL FullToHalf(char *str, int nLang);

#define URLDESCSEPCHAR '|'
char* DescriptFromUrl(char* strUrl);

#define CODE_GBK2UNI 1
#define CODE_UNI2GBK 2
#define CODE_BIG52UNI 3
#define CODE_UNI2BIG5 4
#define CODE_GBK2BIG5 5
#define CODE_BIG52GBK 6

const char *mmapOneFile(char *pFileName, MMapFile *mmapfile);
void toplee_cleanup_mmap(void *dummy);
void InitMMResource(void);
const char* LoadOneCodeTable(int nType, char* strFileName);

int getcuryear();

char* mstrncpy(char* strDest, char* strSrc, size_t nCount);
int formurlencode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);

int wmlencode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);
int htmlencode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);

#define MAX_INTERNAL_BUFF 16384
int gb2uni_encode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);
int unicodeencode(char* strInput, int nInputLen, char* strOutBuf, int nOutBufLen);

char *stristr(const char *big, const char *little);

typedef struct auto_string
{
int len, inc_len;
char *strval;
}struAutoString;
#define DEF_INC_LEN (1024)
#define DEF_INT_LEN 12

void init_auto_string(struAutoString *astr, int inc_len);
int add_auto_string(struAutoString *astr, char *new_str);
void free_auto_string(struAutoString *astr);

int unistrcmp(const char *str1, int str1len, const char *str2, int str2len);

void NormalizeName( char *p );

#endif // __TOPLEE_UTIL_INCLUDE__
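
The comment above GBK_COUNT in toplee_util.h gives the table size as (0xFE - 0x81 + 1) * (0xFE - 0x40 + 1) = 126 * 191 = 24066. One natural way to use a table of that size is to index it by lead byte and trail byte in row-major order. The sketch below shows that calculation; the function name and the row-major layout are assumptions, since the article does not say how the .tab files are organized.

```c
#include <assert.h>

#define SKETCH_GBK_LEAD_MIN   0x81
#define SKETCH_GBK_LEAD_MAX   0xFE
#define SKETCH_GBK_TRAIL_MIN  0x40
#define SKETCH_GBK_TRAIL_MAX  0xFE
#define SKETCH_GBK_ROW_LEN    (SKETCH_GBK_TRAIL_MAX - SKETCH_GBK_TRAIL_MIN + 1)  /* 191 */
#define SKETCH_GBK_COUNT      ((SKETCH_GBK_LEAD_MAX - SKETCH_GBK_LEAD_MIN + 1) * SKETCH_GBK_ROW_LEN)  /* 24066 */

/* Map a GBK byte pair to a slot in a 24066-entry table, or return -1
 * when either byte is outside the GBK range. Hypothetical helper. */
int gbk_table_index(unsigned char lead, unsigned char trail)
{
    int idx;

    if (lead < SKETCH_GBK_LEAD_MIN || lead > SKETCH_GBK_LEAD_MAX)
        return -1;
    if (trail < SKETCH_GBK_TRAIL_MIN || trail > SKETCH_GBK_TRAIL_MAX)
        return -1;
    idx = (lead - SKETCH_GBK_LEAD_MIN) * SKETCH_GBK_ROW_LEN
        + (trail - SKETCH_GBK_TRAIL_MIN);
    assert(idx >= 0 && idx < SKETCH_GBK_COUNT);
    return idx;
}
```

With this layout the first code 0x8140 maps to slot 0 and the last code 0xFEFE maps to slot 24065, so a lookup is a single array access once the index is known.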


php_toplee.h

/*
   +----------------------------------------------------------------------+
   | PHP Version 4                                                        |
   +----------------------------------------------------------------------+
   | Copyright (c) 1997-2002 The PHP Group                                |
   +----------------------------------------------------------------------+
   | This source file is subject to version 2.02 of the PHP license,      |
   | that is bundled with this package in the file LICENSE, and is        |
   | available at through the world-wide-web at                           |
   | http://www.php.net/license/2_02.txt.                                 |
   | If you did not receive a copy of the PHP license and are unable to   |
   | obtain it through the world-wide-web, please send a note to          |
   | license@php.net so we can mail you a copy immediately.               |
   +----------------------------------------------------------------------+
   | Author:                                                              |
   +----------------------------------------------------------------------+

   $Id: header,v 1.10 2002/02/28 08:25:27 sebastian Exp $
*/

#ifndef PHP_GBK_H
#define PHP_GBK_H

extern zend_module_entry gbk_module_entry;
#define phpext_gbk_ptr &gbk_module_entry

#ifdef PHP_WIN32
#define PHP_GBK_API __declspec(dllexport)
#else
#define PHP_GBK_API
#endif

#ifdef ZTS
#include "TSRM.h"
#endif

PHP_MINIT_FUNCTION(gbk);
PHP_MSHUTDOWN_FUNCTION(gbk);
PHP_RINIT_FUNCTION(gbk);
PHP_RSHUTDOWN_FUNCTION(gbk);
PHP_MINFO_FUNCTION(gbk);

PHP_FUNCTION(confirm_gbk_compiled); /* For testing, remove later. */

PHP_FUNCTION(toplee_decode_utf);
PHP_FUNCTION(toplee_decode_utf_gb);
PHP_FUNCTION(toplee_decode_utf_big5);
PHP_FUNCTION(toplee_encode_utf_gb);

PHP_FUNCTION(toplee_big52gbk);
PHP_FUNCTION(toplee_gbk2big5);
PHP_FUNCTION(toplee_fan2jian);
PHP_FUNCTION(toplee_normalize_name);

/*
Declare any global variables you may need between the BEGIN
and END macros here:

ZEND_BEGIN_MODULE_GLOBALS(gbk)
int global_value;
char *global_string;
ZEND_END_MODULE_GLOBALS(gbk)
*/

/* In every utility function you add that needs to use variables
in php_gbk_globals, call TSRM_FETCH(); after declaring other
variables used by that function, or better yet, pass in TSRMLS_CC
after the last function argument and declare your utility function
with TSRMLS_DC after the last declared argument. Always refer to
the globals in your function as GBK_G(variable). You are
encouraged to rename these macros something shorter, see
examples in any other php module directory.
*/

#ifdef ZTS
#define GBK_G(v) TSRMG(gbk_globals_id, zend_gbk_globals *, v)
#else
#define GBK_G(v) (gbk_globals.v)
#endif

#endif /* PHP_GBK_H */

/*
* Local variables:
* tab-width: 4
* c-basic-offset: 4
* indent-tabs-mode: t
* End:
*/


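This completes all of the C code. The module also needs several code-table files, such as gb2b5.tab and uni2gb.tab. I will not provide them here; you can look up documentation on how to generate them, and many such .tab code-table files can be downloaded online.

Next, we can build and test.

Go back to the root of the PHP source tree and run:
# ./buildconf
# ./configure --with-toplee=shared ……
# make
# make install

This completes building the module into PHP. Because of the shared parameter, the toplee module is built as toplee.so. It can be loaded with extension=toplee.so in php.ini or in an extensions.ini file, or loaded dynamically from a PHP script with the dl() function; after that, the functions we defined earlier are available in PHP.

My own technical skills are limited, so if anything in this article is wrong, I welcome corrections from more experienced readers. I also hope it encourages more PHP enthusiasts to share their own valuable experience!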

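179 thoughts on “An Example of Developing a PHP .so Extension Module in C on Linux/FreeBSD”

  1. I have not studied Python much and know very little about it, but I know it is a fine language and I would like to look into it when I get the chance.
    Separating PHP code from HTML is already common practice in PHP-based architectures. Well-known options include the Smarty template engine, which is compiled into PHP, and the PHPLib template engine, which is written in PHP; both make it easy to separate HTML from PHP code. The WordPress blog software I am using now, for example, uses a template engine, so you can freely switch skins (styles) and the interface language.

  2. I have read several of your articles and they are well written, especially "说说大型高并发高负载网站的系统架构" (on the system architecture of large, high-concurrency, high-load websites). It is quite similar to the approach our service system uses, and I learned a lot. Thank you :).

  3. Heh, that was a slip. At the time I was thinking of a PHP cache engine I had looked into early on, one compiled into PHP and written, I believe, by a Russian developer. Smarty does not need to be compiled and has no compiled version. If PHP one day becomes powerful enough that extensions can be written in PHP itself, then Smarty could be compiled in... :)

  4. A question: if the C side gives me an xxx.so file, how do I call its interfaces from PHP and pass parameters to it?
    Or do I need to put all of the C code into xxx.c?

  5. Hello Jun. This is my first visit to your blog and I like it a lot.
    I have started working with PHP for my job. Seeing the discussion of Smarty, I will add a few words: Smarty is a very important technology. When I first moved from Java to PHP, I would have given up long ago (pages and code all mixed together) if I had not come across it.
    Although this article is from 2006, I have only recently started studying this topic and plan to try it myself. I hope you will not mind helping if I run into problems.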
