Reposted from http://blog.csdn.net/qiaofangjie/article/details/18042407
My requirement was:
train a neural network with theano's CNN and save the final, stable network weights; then implement the CNN forward pass in C++, read theano's weights, and reproduce theano's test results.
What I ended up with:
1. the forward pass of a convolutional neural network;
2. forward and backward passes for an MLP, i.e. it can also be used to train on samples.
Note:
to reproduce theano's test results, the hidden-layer activation must be tanh;
for training the MLP, the activation should be sigmoid instead.
The results:
the figure below shows theano's training and test output; the validation error rate is 9.23%.
Below that is my C++ program, whose validation error rate is also 9.23% — a perfect reproduction of theano's result.
In short, there were two hard problems:
1. How do theano's weights and test samples get exchanged with C++?
2. During theano's convolution, how are the feature maps of the previous layer combined and mapped onto each pixel of the current layer?
I took many detours solving these two problems.

To reproduce theano's test results in C++, C++ has to be able to read the weights and test samples that theano saves. My analysis went as follows:
1. theano's weights are in numpy format, and exchanging that directly with C++ is hard: the numpy format is awkward to parse and there is little material about it online.
2. Use python as an intermediate converter to satisfy point 1. Reading theano's code, I found that the training samples loaded into python don't have to be numpy arrays — plain python objects work. But files dumped by python's cPickle carry a lot of format overhead, so they are unsuitable for exchange with C++.
3. Convert via JSON: both python and C++ have JSON interfaces, so convert on both sides and exchange JSON. But the weights after theano training are numpy arrays, and they must become python lists before json can write them to a file. The question became: how do you turn a numpy array into a python list?
4. To solve point 3 I searched for a whole day and finally found numpy's tolist interface, which converts a numpy array into a python list.
5. Now both python and C++ could use JSON. I studied the jsoncpp library to read the python-written JSON files, but testing showed jsoncpp handles large files poorly — it easily runs out of memory and is extremely slow — so it was out.
6. I wrote my own C++ function to parse the JSON file. And when generating training and test samples from the pot files, I generate them directly in C++ as well, with no need for the numpy array format.

That settled difficulty 1: exchange weights and test samples between C++ and theano through JSON, with a hand-written C++ JSON parser.
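The tolist-plus-JSON round trip of steps 4–6 boils down to a few lines of stdlib Python. This is a minimal sketch: the nested list stands in for a weight matrix that has already been converted with numpy's `tolist()`, and the values are made up.

```python
import json

# a 2x3 "weight matrix" as a plain python list, i.e. what numpy's
# w.tolist() returns for a 2-D array
weights = [[0.1, -0.2, 0.3],
           [0.4, 0.5, -0.6]]
biases = [0.0, 0.1]

# serialise the same [w, b] per-layer pairs the post writes to file
text = json.dumps([weights, biases])

# the reader (python here, the hand-written C++ parser in the post)
# recovers the values exactly
w2, b2 = json.loads(text)
```

Since JSON stores floats in full `repr` precision, the weights survive the round trip bit-exactly, which is what makes reproducing the 9.23% error rate possible at all.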
For difficulty 2, look at a typical CNN diagram.
In detail, difficulty 2 is:
When theano goes from S2 to C3, how does it choose which S2 feature maps to combine — a fixed selection each time, or a dynamic combination by some algorithm? And when theano pools from C3 to S4 with a poolsize of (2, 2), how do every 4 pixels of C3 become one pixel of S4?

After a lot of analysis and comparative testing, I reached these conclusions:
- From S2 to C3, theano combines all of S2's feature maps.
- From C3 to S4 with poolsize (2, 2), for every 4 pixels of C3 theano takes the maximum as the one S4 pixel.

With that, the theory was basically clear. Next came writing code to match it — and the truly time-consuming part was debugging: I kept failing to reproduce theano's test results. More than once I decided it couldn't be done; who knows what theano does inside. Today the code finally worked, and in my excitement I wrote this post.

Two bugs blocked me: one theoretical — I didn't have an accurate grasp of the details of theano's convolution — and one plain carelessness, a wrongly initialised variable. Specifically:
- When convolving from S2 to C3, theano rotates the convolution kernel by 180 degrees before convolving as in the figure below (I was new to this and honestly had no idea...).
- When taking the pixel maximum from C3 to S4, I assumed all pixels were positive and initialised the variable to 0, so the maximum search went wrong (this bug took the longest to find — a painful lesson...).
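The kernel-rotation pitfall is easy to demonstrate with a toy example in plain Python (a sketch of the convention, not theano's actual implementation): a true 2-D convolution equals cross-correlation with the kernel rotated 180 degrees, so sliding saved theano kernels over the image without the rotation produces different numbers.

```python
def rot180(k):
    """Rotate a 2-D list by 180 degrees (what numpy.rot90(k, 2) does)."""
    return [row[::-1] for row in k[::-1]]

def correlate_valid(img, k):
    """Plain 'valid' cross-correlation: slide k over img without flipping."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + m][j + n] * k[m][n]
                 for m in range(kh) for n in range(kw))
             for j in range(ow)] for i in range(oh)]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
kernel = [[1, 0],
          [0, 2]]

# true convolution = cross-correlation with the 180-degree rotated kernel
conv = correlate_valid(img, rot180(kernel))
corr = correlate_valid(img, kernel)
# conv and corr disagree, which is exactly why the C++ forward pass must
# see kernels in the same orientation theano used them
```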
The theano weight-writing function. Note that it saves the kernels rotated by 180 degrees, and that a 2-D weight matrix is transposed (to match the C++ weight layout):

```python
def getDataJson(layers):
    data = []
    i = 0
    for layer in layers:
        w, b = layer.params
        # print '..layer is', i
        w, b = w.get_value(), b.get_value()
        wshape = w.shape
        # print '...the shape of w is', wshape
        if len(wshape) == 2:
            w = w.transpose()
        else:
            for k in xrange(wshape[0]):
                for j in xrange(wshape[1]):
                    w[k][j] = numpy.rot90(w[k][j], 2)
            w = w.reshape((wshape[0], numpy.prod(wshape[1:])))
        w = w.tolist()
        b = b.tolist()
        data.append([w, b])
        i += 1
    return data

def writefile(data, name='../../tmp/src/data/theanocnn.json'):
    print ('writefile is ' + name)
    f = open(name, "wb")
    json.dump(data, f)
    f.close()
```
Reading the weights back into theano:

```python
def readfile(layers, nkerns, name='../../tmp/src/data/theanocnn.json'):
    # Load the dataset
    print ('readfile is ' + name)
    f = open(name, 'rb')
    data = json.load(f)
    f.close()
    readwb(data, layers, nkerns)

def readwb(data, layers, nkerns):
    i = 0
    kernSize = len(nkerns)
    inputnum = 1
    for layer in layers:
        w, b = data[i]
        w = numpy.array(w, dtype='float32')
        b = numpy.array(b, dtype='float32')
        # print '..layer is', i
        # print w.shape
        if i >= kernSize:
            w = w.transpose()
        else:
            w = w.reshape((nkerns[i], inputnum, 5, 5))
            for k in xrange(nkerns[i]):
                for j in xrange(inputnum):
                    c = w[k][j]
                    w[k][j] = numpy.rot90(c, 2)
            inputnum = nkerns[i]
            # print '..readwb, transpose and rot180'
        # print w.shape
        layer.W.set_value(w, borrow=True)
        layer.b.set_value(b, borrow=True)
        i += 1
```
The test samples are generated from the MNIST handwritten-digit set; the core code:

```python
def mnist2json_small(cnnName='mnist_small.json', validNumber=10):
    dataset = '../../data/mnist.pkl'
    print '... loading data', dataset
    # Load the dataset
    f = open(dataset, 'rb')
    train_set, valid_set, test_set = cPickle.load(f)
    # print test_set
    f.close()

    def np2listSmall(train_set, number):
        trainfile = []
        trains, labels = train_set
        trainfile = []
        # comment out the next line to generate only `number` validation samples
        number = len(labels)
        for one in trains[:number]:
            one = one.tolist()
            trainfile.append(one)
        labelfile = labels[:number].tolist()
        datafile = [trainfile, labelfile]
        return datafile

    smallData = valid_set
    print len(smallData)
    valid, validlabel = np2listSmall(smallData, validNumber)
    datafile = [valid, validlabel]
    basedir = '../../tmp/src/data/'
    # basedir = './'
    json.dump(datafile, open(basedir + cnnName, 'wb'))
```
What I learned personally:
- Facing a hard task, break it down step by step and tackle the pieces one by one.
- While solving a problem, if one road is blocked, look for another immediately — just like doing proof exercises in math class.
- Keep a positive attitude, don't give up easily, and see the task through.
- When debugging, build fairly comprehensive test cases first; they let you localise bugs quickly.
My requirement and the difficulties in implementing it are basically described above. If other small issues remain, I'm sure you can solve them in far, far less time than I spent. The code follows.
If you don't want to set up a project yourself, there is VS2008 C++ code here; just generate the weights with theano as described and it will read them and run.
C++ code
main.cpp

```cpp
#include <iostream>
#include "mlp.h"
#include "util.h"
#include "testinherit.h"
#include "neuralNetwork.h"
using namespace std;

/************************************************************************/
/* This program implements:
   1. the forward pass of a convolutional neural network;
   2. forward and backward passes for an MLP, i.e. it can train on samples.
   Note:
   to reproduce theano's test results, the hidden-layer activation must
   be tanh; for training the MLP, choose sigmoid instead.               */
/************************************************************************/

int main()
{
    cout << "****cnn****" << endl;
    TestCnnTheano(28 * 28, 10);
    // TestMlpMnist runs the MLP on the MNIST training samples
    //TestMlpMnist(28 * 28, 500, 10);
    return 0;
}
```
neuralNetwork.h
```cpp
#ifndef NEURALNETWORK_H
#define NEURALNETWORK_H

#include "mlp.h"
#include "cnn.h"
#include <vector>
using std::vector;

/************************************************************************/
/* A convolutional neural network                                       */
/************************************************************************/
class NeuralNetWork
{
public:
    NeuralNetWork(int iInput, int iOut);
    ~NeuralNetWork();
    void Predict(double** in_data, int n);
    double CalErrorRate(const vector<double*> &vecvalid, const vector<WORD> &vecValidlabel);
    void Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb);
    void SetTrainNum(int iNum);
    int Predict(double *pInputData);
    // void Forward_propagation(double** ppdata, int n);
    double* Forward_propagation(double *);

private:
    int m_iSampleNum;  // number of samples
    int m_iInput;      // input dimension
    int m_iOut;        // output dimension
    vector<CnnLayer*> vecCnns;
    Mlp *m_pMlp;
};

void TestCnnTheano(const int iInput, const int iOut);
#endif
```
neuralNetwork.cpp
```cpp
#include "neuralNetwork.h"
#include <iostream>
#include <iomanip>
#include "util.h"
using namespace std;

NeuralNetWork::NeuralNetWork(int iInput, int iOut):
    m_iSampleNum(0), m_iInput(iInput), m_iOut(iOut), m_pMlp(NULL)
{
    int iFeatureMapNumber = 20, iPoolWidth = 2, iInputImageWidth = 28,
        iKernelWidth = 5, iInputImageNumber = 1;

    CnnLayer *pCnnLayer = new CnnLayer(m_iSampleNum, iInputImageNumber, iInputImageWidth,
                                       iFeatureMapNumber, iKernelWidth, iPoolWidth);
    vecCnns.push_back(pCnnLayer);

    iInputImageNumber = 20;
    iInputImageWidth = 12;
    iFeatureMapNumber = 50;
    pCnnLayer = new CnnLayer(m_iSampleNum, iInputImageNumber, iInputImageWidth,
                             iFeatureMapNumber, iKernelWidth, iPoolWidth);
    vecCnns.push_back(pCnnLayer);

    const int ihiddenSize = 1;
    int phidden[ihiddenSize] = {500};
    // construct LogisticRegression
    m_pMlp = new Mlp(m_iSampleNum, iFeatureMapNumber * 4 * 4, m_iOut, ihiddenSize, phidden);
}

NeuralNetWork::~NeuralNetWork()
{
    for (vector<CnnLayer*>::iterator it = vecCnns.begin(); it != vecCnns.end(); ++it)
    {
        delete *it;
    }
    delete m_pMlp;
}

void NeuralNetWork::SetTrainNum(int iNum)
{
    m_iSampleNum = iNum;
    for (size_t i = 0; i < vecCnns.size(); ++i)
    {
        vecCnns[i]->SetTrainNum(iNum);
    }
    m_pMlp->SetTrainNum(iNum);
}

int NeuralNetWork::Predict(double *pdInputdata)
{
    double *pdPredictData = NULL;
    pdPredictData = Forward_propagation(pdInputdata);
    int iResult = -1;
    iResult = m_pMlp->m_pLogisticLayer->Predict(pdPredictData);
    return iResult;
}

double* NeuralNetWork::Forward_propagation(double *pdInputData)
{
    double *pdPredictData = pdInputData;
    vector<CnnLayer*>::iterator it;
    CnnLayer *pCnnLayer;
    for (it = vecCnns.begin(); it != vecCnns.end(); ++it)
    {
        pCnnLayer = *it;
        pCnnLayer->Forward_propagation(pdPredictData);
        pdPredictData = pCnnLayer->GetOutputData();
    }
    // here pCnnLayer points at the last convolutional layer
    // and pdPredictData holds its final output
    pdPredictData = m_pMlp->Forward_propagation(pdPredictData);
    return pdPredictData;
}

void NeuralNetWork::Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb)
{
    for (size_t i = 0; i < vecCnns.size(); ++i)
    {
        vecCnns[i]->Setwb(vvAllw[i], vvAllb[i]);
    }
    size_t iLayerNum = vvAllw.size();
    for (size_t i = vecCnns.size(); i < iLayerNum - 1; ++i)
    {
        int iHiddenIndex = 0;
        m_pMlp->m_ppHiddenLayer[iHiddenIndex]->Setwb(vvAllw[i], vvAllb[i]);
        ++iHiddenIndex;
    }
    m_pMlp->m_pLogisticLayer->Setwb(vvAllw[iLayerNum - 1], vvAllb[iLayerNum - 1]);
}

double NeuralNetWork::CalErrorRate(const vector<double*> &vecvalid, const vector<WORD> &vecValidlabel)
{
    cout << "Predict------------" << endl;
    int iErrorNumber = 0, iValidNumber = vecValidlabel.size();
    //iValidNumber = 1;
    for (int i = 0; i < iValidNumber; ++i)
    {
        int iResult = Predict(vecvalid[i]);
        //cout << i << ", valid is " << iResult << " label is " << vecValidlabel[i] << endl;
        if (iResult != vecValidlabel[i])
        {
            ++iErrorNumber;
        }
    }
    cout << "the num of error is " << iErrorNumber << endl;
    double dErrorRate = (double)iErrorNumber / iValidNumber;
    cout << "the error rate of Train sample by softmax is "
         << setprecision(10) << dErrorRate * 100 << "%" << endl;
    return dErrorRate;
}

/************************************************************************/
/* The test samples come from MNIST. The CNN structure matches the
   theano tutorial: the input is a 28*28 image, followed by two
   convolutional layers (convolution + pooling) with 20 and 50 feature
   maps respectively, then a fully connected layer (500 neurons) and a
   10-neuron output layer.                                              */
/************************************************************************/
void TestCnnTheano(const int iInput, const int iOut)
{
    // build the convolutional network
    NeuralNetWork neural(iInput, iOut);
    // containers for theano's weights
    vector< vector<double*> > vvAllw;
    vector< vector<double> > vvAllb;
    // containers for the test samples and labels
    vector<double*> vecValid;
    vector<WORD> vecLabel;
    // files holding theano's weights and the test samples
    const char *szTheanoWeigh = "../../data/theanocnn.json",
               *szTheanoTest = "../../data/mnist_validall.json";
    // store each weight matrix's second dimension (column width),
    // used when reading the json file
    vector<int> vecSecondDimOfWeigh;
    vecSecondDimOfWeigh.push_back(5 * 5);
    vecSecondDimOfWeigh.push_back(20 * 5 * 5);
    vecSecondDimOfWeigh.push_back(50 * 4 * 4);
    vecSecondDimOfWeigh.push_back(500);

    cout << "loadwb ---------" << endl;
    // read the weights
    LoadWeighFromJson(vvAllw, vvAllb, szTheanoWeigh, vecSecondDimOfWeigh);
    // push the weights into the cnn
    neural.Setwb(vvAllw, vvAllb);
    // read the test file
    LoadTestSampleFromJson(vecValid, vecLabel, szTheanoTest, iInput);
    // set the total number of test samples
    int iVaildNum = vecValid.size();
    neural.SetTrainNum(iVaildNum);
    // run the forward pass and print the error rate
    neural.CalErrorRate(vecValid, vecLabel);
    // free the memory allocated while reading the test file
    for (vector<double*>::iterator cit = vecValid.begin(); cit != vecValid.end(); ++cit)
    {
        delete [](*cit);
    }
}
```
cnn.h
```cpp
#ifndef CNN_H
#define CNN_H
#include "featuremap.h"
#include "poollayer.h"
#include <vector>
using std::vector;

typedef unsigned short WORD;

/**
 * This convolutional layer mimics theano's test procedure.
 * When the input layer has num feature maps, suppose this layer has featureNum feature maps.
 * For each output pixel, all num input feature maps are combined together, with no bias.
 * The result then goes to the pooling layer; pooling takes the maximum within each
 * poolsize window and then adds a bias — featureNum bias values in total.
 */
class CnnLayer
{
public:
    CnnLayer(int iSampleNum, int iInputImageNumber, int iInputImageWidth,
             int iFeatureMapNumber, int iKernelWidth, int iPoolWidth);
    ~CnnLayer();
    void Forward_propagation(double *pdInputData);
    void Back_propagation(double* , double* , double );
    void Train(double *x, WORD y, double dLr);
    int Predict(double *);
    void Setwb(vector<double*> &vpdw, vector<double> &vdb);
    void SetInputAllData(double **ppInputAllData, int iInputNum);
    void SetTrainNum(int iSampleNumber);
    void PrintOutputData();
    double* GetOutputData();

private:
    int m_iSampleNum;
    FeatureMap *m_pFeatureMap;
    PoolLayer *m_pPoolLayer;
    // values needed for backpropagation
    double **m_ppdDelta;
    double *m_pdInputData;
    double *m_pdOutputData;
};

void TestCnn();

#endif // CNN_H
```
cnn.cpp

```cpp
#include "cnn.h"
#include "util.h"
#include <cassert>

CnnLayer::CnnLayer(int iSampleNum, int iInputImageNumber, int iInputImageWidth,
                   int iFeatureMapNumber, int iKernelWidth, int iPoolWidth):
    m_iSampleNum(iSampleNum), m_pdInputData(NULL), m_pdOutputData(NULL)
{
    m_pFeatureMap = new FeatureMap(iInputImageNumber, iInputImageWidth,
                                   iFeatureMapNumber, iKernelWidth);
    int iFeatureMapWidth = iInputImageWidth - iKernelWidth + 1;
    m_pPoolLayer = new PoolLayer(iFeatureMapNumber, iPoolWidth, iFeatureMapWidth);
}

CnnLayer::~CnnLayer()
{
    delete m_pFeatureMap;
    delete m_pPoolLayer;
}

void CnnLayer::Forward_propagation(double *pdInputData)
{
    m_pFeatureMap->Convolute(pdInputData);
    m_pPoolLayer->Convolute(m_pFeatureMap->GetFeatureMapValue());
    m_pdOutputData = m_pPoolLayer->GetOutputData();
    /************************************************************************/
    /* debug output for each stage of the convolution; remove once it works */
    /************************************************************************/
    /*m_pFeatureMap->PrintOutputData();
    m_pPoolLayer->PrintOutputData();*/
}

void CnnLayer::SetInputAllData(double **ppInputAllData, int iInputNum)
{
}

double* CnnLayer::GetOutputData()
{
    assert(NULL != m_pdOutputData);
    return m_pdOutputData;
}

void CnnLayer::Setwb(vector<double*> &vpdw, vector<double> &vdb)
{
    m_pFeatureMap->SetWeigh(vpdw);
    m_pPoolLayer->SetBias(vdb);
}

void CnnLayer::SetTrainNum(int iSampleNumber)
{
    m_iSampleNum = iSampleNumber;
}

void CnnLayer::PrintOutputData()
{
    m_pFeatureMap->PrintOutputData();
    m_pPoolLayer->PrintOutputData();
}

void TestCnn()
{
    const int iFeatureMapNumber = 2, iPoolWidth = 2, iInputImageWidth = 8,
              iKernelWidth = 3, iInputImageNumber = 2;

    double *pdImage = new double[iInputImageWidth * iInputImageWidth * iInputImageNumber];
    double arrInput[iInputImageNumber][iInputImageWidth * iInputImageWidth];
    MakeCnnSample(arrInput, pdImage, iInputImageWidth, iInputImageNumber);

    double *pdKernel = new double[3 * 3 * iInputImageNumber];
    double arrKernel[3 * 3 * iInputImageNumber];
    MakeCnnWeigh(pdKernel, iInputImageNumber);

    CnnLayer cnn(3, iInputImageNumber, iInputImageWidth, iFeatureMapNumber,
                 iKernelWidth, iPoolWidth);

    vector<double*> vecWeigh;
    vector<double> vecBias;
    for (int i = 0; i < iFeatureMapNumber; ++i)
    {
        vecBias.push_back(1.0);
    }
    vecWeigh.push_back(pdKernel);
    for (int i = 0; i < 3 * 3 * 2; ++i)
    {
        arrKernel[i] = i;
    }
    vecWeigh.push_back(arrKernel);
    cnn.Setwb(vecWeigh, vecBias);

    cnn.Forward_propagation(pdImage);
    cnn.PrintOutputData();

    delete []pdKernel;
    delete []pdImage;
}
```
featuremap.h
```cpp
#ifndef FEATUREMAP_H
#define FEATUREMAP_H
#include <cassert>
#include <vector>
using std::vector;

class FeatureMap
{
public:
    FeatureMap(int iInputImageNumber, int iInputImageWidth,
               int iFeatureMapNumber, int iKernelWidth);
    ~FeatureMap();
    void Forward_propagation(double* );
    void Back_propagation(double* , double* , double );
    void Convolute(double *pdInputData);
    int GetFeatureMapSize()
    {
        return m_iFeatureMapSize;
    }
    int GetFeatureMapWidth()
    {
        return m_iFeatureMapWidth;
    }
    double* GetFeatureMapValue()
    {
        assert(m_pdOutputValue != NULL);
        return m_pdOutputValue;
    }
    void SetWeigh(const vector<double*> &vecWeigh);
    void PrintOutputData();

    double **m_ppdWeigh;
    double *m_pdBias;

private:
    int m_iInputImageNumber;
    int m_iInputImageWidth;
    int m_iInputImageSize;
    int m_iFeatureMapNumber;
    int m_iFeatureMapWidth;
    int m_iFeatureMapSize;
    int m_iKernelWidth;
    // double m_dBias;
    double *m_pdOutputValue;
};

#endif // FEATUREMAP_H
```
featuremap.cpp
```cpp
#include "featuremap.h"
#include "util.h"
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

FeatureMap::FeatureMap(int iInputImageNumber, int iInputImageWidth,
                       int iFeatureMapNumber, int iKernelWidth):
    m_iInputImageNumber(iInputImageNumber),
    m_iInputImageWidth(iInputImageWidth),
    m_iFeatureMapNumber(iFeatureMapNumber),
    m_iKernelWidth(iKernelWidth)
{
    m_iFeatureMapWidth = m_iInputImageWidth - m_iKernelWidth + 1;
    m_iInputImageSize = m_iInputImageWidth * m_iInputImageWidth;
    m_iFeatureMapSize = m_iFeatureMapWidth * m_iFeatureMapWidth;
    int iKernelSize;
    iKernelSize = m_iKernelWidth * m_iKernelWidth;
    double dbase = 1.0 / m_iInputImageSize;
    srand((unsigned)time(NULL));
    m_ppdWeigh = new double*[m_iFeatureMapNumber];
    m_pdBias = new double[m_iFeatureMapNumber];
    for (int i = 0; i < m_iFeatureMapNumber; ++i)
    {
        m_ppdWeigh[i] = new double[m_iInputImageNumber * iKernelSize];
        for (int j = 0; j < m_iInputImageNumber * iKernelSize; ++j)
        {
            m_ppdWeigh[i][j] = uniform(-dbase, dbase);
        }
        //m_pdBias[i] = uniform(-dbase, dbase);
        // theano's convolutional layer apparently has no bias;
        // the bias is applied in the pooling layer
        m_pdBias[i] = 0;
    }
    m_pdOutputValue = new double[m_iFeatureMapNumber * m_iFeatureMapSize];
    // m_dBias = uniform(-dbase, dbase);
}

FeatureMap::~FeatureMap()
{
    delete []m_pdOutputValue;
    delete []m_pdBias;
    for (int i = 0; i < m_iFeatureMapNumber; ++i)
    {
        delete []m_ppdWeigh[i];
    }
    delete []m_ppdWeigh;
}

void FeatureMap::SetWeigh(const vector<double*> &vecWeigh)
{
    assert(vecWeigh.size() == (DWORD)m_iFeatureMapNumber);
    for (int i = 0; i < m_iFeatureMapNumber; ++i)
    {
        delete []m_ppdWeigh[i];
        m_ppdWeigh[i] = vecWeigh[i];
    }
}

/*
convolution
pdInputData: a flat vector holding all the input images
*/
void FeatureMap::Convolute(double *pdInputData)
{
    for (int iMapIndex = 0; iMapIndex < m_iFeatureMapNumber; ++iMapIndex)
    {
        double dBias = m_pdBias[iMapIndex];
        // each featuremap
        for (int i = 0; i < m_iFeatureMapWidth; ++i)
        {
            for (int j = 0; j < m_iFeatureMapWidth; ++j)
            {
                double dSum = 0.0;
                int iInputIndex, iKernelIndex, iInputIndexStart, iKernelStart, iOutIndex;
                // index into the output vector
                iOutIndex = iMapIndex * m_iFeatureMapSize + i * m_iFeatureMapWidth + j;
                // accumulate over every input image
                for (int k = 0; k < m_iInputImageNumber; ++k)
                {
                    // start position in the input image that the kernel covers
                    //iInputIndexStart = k * m_iInputImageSize + j * m_iInputImageWidth + i;
                    iInputIndexStart = k * m_iInputImageSize + i * m_iInputImageWidth + j;
                    // start position of the kernel
                    iKernelStart = k * m_iKernelWidth * m_iKernelWidth;
                    for (int m = 0; m < m_iKernelWidth; ++m)
                    {
                        for (int n = 0; n < m_iKernelWidth; ++n)
                        {
                            //iKernelIndex = iKernelStart + n * m_iKernelWidth + m;
                            iKernelIndex = iKernelStart + m * m_iKernelWidth + n;
                            // i am not sure whether the expression below is correct
                            iInputIndex = iInputIndexStart + m * m_iInputImageWidth + n;
                            dSum += pdInputData[iInputIndex] * m_ppdWeigh[iMapIndex][iKernelIndex];
                        }//end n
                    }//end m
                }//end k
                // add the bias
                //dSum += dBias;
                m_pdOutputValue[iOutIndex] = dSum;
            }//end j
        }//end i
    }//end iMapIndex
}

void FeatureMap::PrintOutputData()
{
    for (int i = 0; i < m_iFeatureMapNumber; ++i)
    {
        cout << "featuremap " << i << endl;
        // ... (the rest of this listing was truncated in the original post)
    }
}
```
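The flat, row-major indexing that `Convolute` relies on (`pixel (i, j)` of image `k` lives at `k * imageSize + i * width + j`) can be sanity-checked in a few lines of Python. This is my own sketch with toy sizes, not part of the original code:

```python
width = 4  # image width, so a flat image has width * width entries

# fill a 2-D image and its flat copy from the same formula
img2d = [[10 * i + j for j in range(width)] for i in range(width)]
flat = [10 * i + j for i in range(width) for j in range(width)]

# row-major rule: pixel (i, j) lives at flat index i * width + j
ok = all(img2d[i][j] == flat[i * width + j]
         for i in range(width) for j in range(width))
```

Getting this mapping wrong (e.g. swapping `i` and `j`, as in the commented-out lines of the C++ above) silently transposes the image, which is one of the easiest ways to fail to match theano's numbers.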
poollayer.h
```cpp
#ifndef POOLLAYER_H
#define POOLLAYER_H
#include <vector>
using std::vector;

class PoolLayer
{
public:
    PoolLayer(int iOutImageNumber, int iPoolWidth, int iFeatureMapWidth);
    ~PoolLayer();
    void Convolute(double *pdInputData);
    void SetBias(const vector<double> &vecBias);
    double* GetOutputData();
    void PrintOutputData();

private:
    int m_iOutImageNumber;
    int m_iPoolWidth;
    int m_iFeatureMapWidth;
    int m_iPoolSize;
    int m_iOutImageEdge;
    int m_iOutImageSize;
    double *m_pdOutData;
    double *m_pdBias;
};

#endif // POOLLAYER_H
```
poollayer.cpp
```cpp
#include "poollayer.h"
#include "util.h"
#include <iostream>
#include <cassert>
#include <climits>
using namespace std;

PoolLayer::PoolLayer(int iOutImageNumber, int iPoolWidth, int iFeatureMapWidth):
    m_iOutImageNumber(iOutImageNumber),
    m_iPoolWidth(iPoolWidth),
    m_iFeatureMapWidth(iFeatureMapWidth)
{
    m_iPoolSize = m_iPoolWidth * m_iPoolWidth;
    m_iOutImageEdge = m_iFeatureMapWidth / m_iPoolWidth;
    m_iOutImageSize = m_iOutImageEdge * m_iOutImageEdge;
    m_pdOutData = new double[m_iOutImageNumber * m_iOutImageSize];
    m_pdBias = new double[m_iOutImageNumber];
    /*for (int i = 0; i < m_iOutImageNumber; ++i)
    {
        m_pdBias[i] = 1;
    }*/
}

PoolLayer::~PoolLayer()
{
    delete []m_pdOutData;
    delete []m_pdBias;
}

void PoolLayer::Convolute(double *pdInputData)
{
    int iFeatureMapSize = m_iFeatureMapWidth * m_iFeatureMapWidth;
    for (int iOutImageIndex = 0; iOutImageIndex < m_iOutImageNumber; ++iOutImageIndex)
    {
        double dBias = m_pdBias[iOutImageIndex];
        for (int i = 0; i < m_iOutImageEdge; ++i)
        {
            for (int j = 0; j < m_iOutImageEdge; ++j)
            {
                double dValue = 0.0;
                int iInputIndex, iInputIndexStart, iOutIndex;
                /************************************************************************/
                /* This was the biggest bug: dMaxPixel was initialised to 0 before
                ** searching for the maximum. Pixel values can be negative, which broke
                ** every later computation — extremely hard to track down.              */
                /************************************************************************/
                double dMaxPixel = INT_MIN;
                iOutIndex = iOutImageIndex * m_iOutImageSize + i * m_iOutImageEdge + j;
                iInputIndexStart = iOutImageIndex * iFeatureMapSize
                                 + (i * m_iFeatureMapWidth + j) * m_iPoolWidth;
                for (int m = 0; m < m_iPoolWidth; ++m)
                {
                    for (int n = 0; n < m_iPoolWidth; ++n)
                    {
                        // int iPoolIndex = m * m_iPoolWidth + n;
                        // i am not sure whether the expression below is correct
                        iInputIndex = iInputIndexStart + m * m_iFeatureMapWidth + n;
                        if (pdInputData[iInputIndex] > dMaxPixel)
                        {
                            dMaxPixel = pdInputData[iInputIndex];
                        }
                    }//end n
                }//end m
                dValue = dMaxPixel + dBias;
                assert(iOutIndex < m_iOutImageNumber * m_iOutImageSize);
                //m_pdOutData[iOutIndex] = (dMaxPixel);
                m_pdOutData[iOutIndex] = mytanh(dValue);
            }//end j
        }//end i
    }//end iOutImageIndex
}

void PoolLayer::SetBias(const vector<double> &vecBias)
{
    assert(vecBias.size() == (DWORD)m_iOutImageNumber);
    for (int i = 0; i < m_iOutImageNumber; ++i)
    {
        m_pdBias[i] = vecBias[i];
    }
}

double* PoolLayer::GetOutputData()
{
    assert(NULL != m_pdOutData);
    return m_pdOutData;
}

void PoolLayer::PrintOutputData()
{
    for (int i = 0; i < m_iOutImageNumber; ++i)
    {
        cout << "pool image " << i << endl;
        // ... (the rest of this listing was truncated in the original post)
    }
}
```
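The dMaxPixel initialisation bug described above is easy to reproduce in isolation (a minimal sketch; the pixel values are made up, but note that tanh outputs really can be negative):

```python
# a 2x2 pooling window whose pixels are all negative
window = [-0.9, -0.5, -0.7, -0.3]

# buggy search: the running maximum starts at 0, so no negative
# pixel ever beats it and the "maximum" silently becomes 0
buggy_max = 0
for p in window:
    if p > buggy_max:
        buggy_max = p

# correct search: start from negative infinity (or the first element)
correct_max = float("-inf")
for p in window:
    if p > correct_max:
        correct_max = p
```

The buggy variant returns 0 instead of -0.3, and since the result then feeds tanh and every later layer, the final predictions drift without any crash — which is why this was so hard to find.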
mlp.h
```cpp
#ifndef MLP_H
#define MLP_H
#include "hiddenLayer.h"
#include "logisticRegression.h"

class Mlp
{
public:
    Mlp(int n, int n_i, int n_o, int nhl, int *hls);
    ~Mlp();
    // void Train(double** in_data, double** in_label, double dLr, int epochs);
    void Predict(double** in_data, int n);
    void Train(double *x, WORD y, double dLr);
    void TrainAllSample(const vector<double*> &vecTrain,
                        const vector<WORD> &vectrainlabel, double dLr);
    double CalErrorRate(const vector<double*> &vecvalid,
                        const vector<WORD> &vecValidlabel);
    void Writewb(const char *szName);
    void Readwb(const char *szName);
    void Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb);
    void SetTrainNum(int iNum);
    int Predict(double *pInputData);
    // void Forward_propagation(double** ppdata, int n);
    double* Forward_propagation(double *);
    int* GetHiddenSize();
    int GetHiddenNumber();
    double *GetHiddenOutputData();

    HiddenLayer **m_ppHiddenLayer;
    LogisticRegression *m_pLogisticLayer;

private:
    int m_iSampleNum;          // number of samples
    int m_iInput;              // input dimension
    int m_iOut;                // output dimension
    int m_iHiddenLayerNum;     // number of hidden layers
    int* m_piHiddenLayerSize;  // hidden layer sizes, e.g. {3,4} means two
                               // hidden layers with 3 and 4 nodes
};

void mlp();
void TestMlpTheano(const int m_iInput, const int ihidden, const int m_iOut);
void TestMlpMnist(const int m_iInput, const int ihidden, const int m_iOut);

#endif
```
mlp.cpp
```cpp
#include <iostream>
#include "mlp.h"
#include "util.h"
#include <cassert>
#include <iomanip>
using namespace std;

const int m_iSamplenum = 8, innode = 3, outnode = 8;

Mlp::Mlp(int n, int n_i, int n_o, int nhl, int *hls)
{
    m_iSampleNum = n;
    m_iInput = n_i;
    m_iOut = n_o;
    m_iHiddenLayerNum = nhl;
    m_piHiddenLayerSize = hls;

    // build the network structure
    m_ppHiddenLayer = new HiddenLayer* [m_iHiddenLayerNum];
    for(int i = 0; i < m_iHiddenLayerNum; ++i)
    {
        if(i == 0)
        {
            m_ppHiddenLayer[i] = new HiddenLayer(m_iInput, m_piHiddenLayerSize[i]); // first hidden layer
        }
        else
        {
            m_ppHiddenLayer[i] = new HiddenLayer(m_piHiddenLayerSize[i-1], m_piHiddenLayerSize[i]); // other hidden layers
        }
    }
    if (m_iHiddenLayerNum > 0)
    {
        m_pLogisticLayer = new LogisticRegression(m_piHiddenLayerSize[m_iHiddenLayerNum - 1],
                                                  m_iOut, m_iSampleNum); // final softmax layer
    }
    else
    {
        m_pLogisticLayer = new LogisticRegression(m_iInput, m_iOut, m_iSampleNum); // final softmax layer
    }
}

Mlp::~Mlp()
{
    // what a double pointer allocates is not necessarily a 2-D array
    for(int i = 0; i < m_iHiddenLayerNum; ++i)
        delete m_ppHiddenLayer[i];  // no [] when deleting the elements
    delete[] m_ppHiddenLayer;
    // log_layer is a plain object pointer, not an array — no array delete
    delete m_pLogisticLayer;  // no [] when deleting
}

void Mlp::TrainAllSample(const vector<double*> &vecTrain,
                         const vector<WORD> &vectrainlabel, double dLr)
{
    cout << "Mlp::TrainAllSample" << endl;
    for (int j = 0; j < m_iSampleNum; ++j)
    {
        Train(vecTrain[j], vectrainlabel[j], dLr);
    }
}

void Mlp::Train(double *pdTrain, WORD usLabel, double dLr)
{
    // cout << "******pdLabel****" << endl;
    // printArrDouble(ppdinLabel, m_iSampleNum, m_iOut);
    double *pdLabel = new double[m_iOut];
    MakeOneLabel(usLabel, pdLabel, m_iOut);

    // forward pass
    for(int n = 0; n < m_iHiddenLayerNum; ++ n)
    {
        if(n == 0) // the first hidden layer takes the raw input
        {
            m_ppHiddenLayer[n]->Forward_propagation(pdTrain);
        }
        else // other hidden layers take the previous layer's output
        {
            m_ppHiddenLayer[n]->Forward_propagation(m_ppHiddenLayer[n-1]->m_pdOutdata);
        }
    }
    // the softmax layer takes the last hidden layer's output
    m_pLogisticLayer->Forward_propagation(m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata);

    // backward pass
    m_pLogisticLayer->Back_propagation(m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata,
                                       pdLabel, dLr);

    for(int n = m_iHiddenLayerNum-1; n >= 1; --n)
    {
        if(n == m_iHiddenLayerNum-1)
        {
            m_ppHiddenLayer[n]->Back_propagation(m_ppHiddenLayer[n-1]->m_pdOutdata,
                m_pLogisticLayer->m_pdDelta, m_pLogisticLayer->m_ppdW,
                m_pLogisticLayer->m_iOut, dLr);
        }
        else
        {
            double *pdInputData;
            pdInputData = m_ppHiddenLayer[n-1]->m_pdOutdata;
            m_ppHiddenLayer[n]->Back_propagation(pdInputData,
                m_ppHiddenLayer[n+1]->m_pdDelta, m_ppHiddenLayer[n+1]->m_ppdW,
                m_ppHiddenLayer[n+1]->m_iOut, dLr);
        }
    }
    // how should this be written?
    if (m_iHiddenLayerNum > 1)
        m_ppHiddenLayer[0]->Back_propagation(pdTrain,
            m_ppHiddenLayer[1]->m_pdDelta, m_ppHiddenLayer[1]->m_ppdW,
            m_ppHiddenLayer[1]->m_iOut, dLr);
    else
        m_ppHiddenLayer[0]->Back_propagation(pdTrain,
            m_pLogisticLayer->m_pdDelta, m_pLogisticLayer->m_ppdW,
            m_pLogisticLayer->m_iOut, dLr);

    delete []pdLabel;
}

void Mlp::SetTrainNum(int iNum)
{
    m_iSampleNum = iNum;
}

double* Mlp::Forward_propagation(double* pData)
{
    double *pdForwardValue = pData;
    for(int n = 0; n < m_iHiddenLayerNum; ++ n)
    {
        if(n == 0) // the first hidden layer takes the raw input
        {
            pdForwardValue = m_ppHiddenLayer[n]->Forward_propagation(pData);
        }
        else // other hidden layers take the previous layer's output
        {
            pdForwardValue = m_ppHiddenLayer[n]->Forward_propagation(pdForwardValue);
        }
    }
    return pdForwardValue;
    // the softmax layer takes the last hidden layer's output
    // m_pLogisticLayer->Forward_propagation(m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata);
    // m_pLogisticLayer->Predict(m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata);
}

int Mlp::Predict(double *pInputData)
{
    Forward_propagation(pInputData);
    int iResult = m_pLogisticLayer->Predict(m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata);
    return iResult;
}

void Mlp::Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb)
{
    for (int i = 0; i < m_iHiddenLayerNum; ++i)
    {
        m_ppHiddenLayer[i]->Setwb(vvAllw[i], vvAllb[i]);
    }
    m_pLogisticLayer->Setwb(vvAllw[m_iHiddenLayerNum], vvAllb[m_iHiddenLayerNum]);
}

void Mlp::Writewb(const char *szName)
{
    for(int i = 0; i < m_iHiddenLayerNum; ++i)
    {
        m_ppHiddenLayer[i]->Writewb(szName);
    }
    m_pLogisticLayer->Writewb(szName);
}

double Mlp::CalErrorRate(const vector<double*> &vecvalid, const vector<WORD> &vecValidlabel)
{
    int iErrorNumber = 0, iValidNumber = vecValidlabel.size();
    for (int i = 0; i < iValidNumber; ++i)
    {
        int iResult = Predict(vecvalid[i]);
        if (iResult != vecValidlabel[i])
        {
            ++iErrorNumber;
        }
    }
    cout << "the num of error is " << iErrorNumber << endl;
    double dErrorRate = (double)iErrorNumber / iValidNumber;
    cout << "the error rate of Train sample by softmax is "
         << setprecision(10) << dErrorRate * 100 << "%" << endl;
    return dErrorRate;
}

void Mlp::Readwb(const char *szName)
{
    long dcurpos = 0, dreadsize = 0;
    for(int i = 0; i < m_iHiddenLayerNum; ++i)
    {
        dreadsize = m_ppHiddenLayer[i]->Readwb(szName, dcurpos);
        cout << "hiddenlayer " << i + 1 << " read bytes: " << dreadsize << endl;
        if (-1 != dreadsize)
            dcurpos += dreadsize;
        else
        {
            cout << "read wb error from HiddenLayer" << endl;
            return;
        }
    }
    dreadsize = m_pLogisticLayer->Readwb(szName, dcurpos);
    if (-1 != dreadsize)
        dcurpos += dreadsize;
    else
    {
        cout << "read wb error from sofmaxLayer" << endl;
        return;
    }
}

int* Mlp::GetHiddenSize()
{
    return m_piHiddenLayerSize;
}

double* Mlp::GetHiddenOutputData()
{
    assert(m_iHiddenLayerNum > 0);
    return m_ppHiddenLayer[m_iHiddenLayerNum-1]->m_pdOutdata;
}

int Mlp::GetHiddenNumber()
{
    return m_iHiddenLayerNum;
}

//double **makeLabelSample(double **label_x)
double **makeLabelSample(double label_x[][outnode])
{
    double **pplabelSample;
    pplabelSample = new double*[m_iSamplenum];
    for (int i = 0; i < m_iSamplenum; ++i)
    {
        pplabelSample[i] = new double[outnode];
    }
    for (int i = 0; i < m_iSamplenum; ++i)
    {
        for (int j = 0; j < outnode; ++j)
            pplabelSample[i][j] = label_x[i][j];
    }
    return pplabelSample;
}

double **maken_train(double train_x[][innode])
{
    double **ppn_train;
    ppn_train = new double*[m_iSamplenum];
    for (int i = 0; i < m_iSamplenum; ++i)
    {
        ppn_train[i] = new double[innode];
    }
    for (int i = 0; i < m_iSamplenum; ++i)
    {
        for (int j = 0; j < innode; ++j)
            ppn_train[i][j] = train_x[i][j];
    }
    return ppn_train;
}

void TestMlpMnist(const int m_iInput, const int ihidden, const int m_iOut)
{
    const int ihiddenSize = 1;
    int phidden[ihiddenSize] = {ihidden};
    // construct LogisticRegression
    Mlp neural(m_iSamplenum, m_iInput, m_iOut, ihiddenSize, phidden);
    vector<double*> vecTrain, vecvalid;
    vector<WORD> vecValidlabel, vectrainlabel;
    LoadTestSampleFromJson(vecvalid, vecValidlabel, "../../data/mnist.json", m_iInput);
    LoadTestSampleFromJson(vecTrain, vectrainlabel, "../../data/mnisttrain.json", m_iInput);
    // test
    int itrainnum = vecTrain.size();
    neural.SetTrainNum(itrainnum);

    const int iepochs = 1;
    const double dLr = 0.1;
    neural.CalErrorRate(vecvalid, vecValidlabel);
    for (int i = 0; i < iepochs; ++i)
    {
        neural.TrainAllSample(vecTrain, vectrainlabel, dLr);
        neural.CalErrorRate(vecvalid, vecValidlabel);
    }

    for (vector<double*>::iterator cit = vecTrain.begin(); cit != vecTrain.end(); ++cit)
    {
        delete [](*cit);
    }
    for (vector<double*>::iterator cit = vecvalid.begin(); cit != vecvalid.end(); ++cit)
    {
        delete [](*cit);
    }
}

void TestMlpTheano(const int m_iInput, const int ihidden, const int m_iOut)
{
    const int ihiddenSize = 1;
    int phidden[ihiddenSize] = {ihidden};
    // construct LogisticRegression
    Mlp neural(m_iSamplenum, m_iInput, m_iOut, ihiddenSize, phidden);
    vector<double*> vecTrain, vecw;
    vector<double> vecb;
    vector<WORD> vecLabel;
    vector< vector<double*> > vvAllw;
    vector< vector<double> > vvAllb;
    const char *pcfilename = "../../data/theanomlp.json";
    vector<int> vecSecondDimOfWeigh;
    vecSecondDimOfWeigh.push_back(m_iInput);
    vecSecondDimOfWeigh.push_back(ihidden);
    LoadWeighFromJson(vvAllw, vvAllb, pcfilename, vecSecondDimOfWeigh);
    LoadTestSampleFromJson(vecTrain, vecLabel, "../../data/mnist_validall.json", m_iInput);
    cout << "loadwb ---------" << endl;
    int itrainnum = vecTrain.size();
    neural.SetTrainNum(itrainnum);
    neural.Setwb(vvAllw, vvAllb);
    cout << "Predict------------" << endl;
    neural.CalErrorRate(vecTrain, vecLabel);
    for (vector<double*>::iterator cit = vecTrain.begin(); cit != vecTrain.end(); ++cit)
    {
        delete [](*cit);
    }
}

void mlp()
{
    // input samples
    double X[m_iSamplenum][innode]= {
        {0,0,0},{0,0,1},{0,1,0},{0,1,1},{1,0,0},{1,0,1},{1,1,0},{1,1,1}
    };
    double Y[m_iSamplenum][outnode]={
        {1, 0, 0, 0, 0, 0, 0, 0},
        {0, 1, 0, 0, 0, 0, 0, 0},
        {0, 0, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 1, 0, 0, 0, 0},
        {0, 0, 0, 0, 1, 0, 0, 0},
        {0, 0, 0, 0, 0, 1, 0, 0},
        {0, 0, 0, 0, 0, 0, 1, 0},
        {0, 0, 0, 0, 0, 0, 0, 1},
    };
    WORD pdLabel[outnode] = {0, 1, 2, 3, 4, 5, 6, 7};
    const int ihiddenSize = 2;
    int phidden[ihiddenSize] = {5, 5};
    //printArr(phidden, 1);
    Mlp neural(m_iSamplenum, innode, outnode, ihiddenSize, phidden);
    double **train_x, **ppdLabel;
    train_x = maken_train(X);
    //printArrDouble(train_x, m_iSamplenum, innode);
    ppdLabel = makeLabelSample(Y);
    for (int i = 0; i < 3500; ++i)
    {
        for (int j = 0; j < m_iSamplenum; ++j)
        {
            neural.Train(train_x[j], pdLabel[j], 0.1);
        }
    }
    cout << "trainning complete..." << endl;
    // ... (the rest of this listing was truncated in the original post)
}
```
hiddenLayer.h
[SEO靠我cpp]view plain copy #ifndef HIDDENLAYER_H #define HIDDENLAYER_H #include "neuralbase.h" class SEO靠我HiddenLayer: public NeuralBase { public: HiddenLayer(int n_i, int n_o); ~HiddenLayerSEO靠我(); double* Forward_propagation(double* input_data); void Back_propagation(double *pdInpSEO靠我utData, double *pdNextLayerDelta, double** ppdnextLayerW, int iNextLayerSEO靠我OutNum, double dLr); }; #endif
hiddenLayer.cpp
#include <iostream>
#include <cmath>
#include <cassert>
#include <cstdlib>
#include <ctime>
#include "hiddenLayer.h"
#include "util.h"

using namespace std;

HiddenLayer::HiddenLayer(int n_i, int n_o) : NeuralBase(n_i, n_o, 0)
{
}

HiddenLayer::~HiddenLayer()
{
}

/************************************************************************/
/* Note:
   To reproduce theano's test results, the hidden layer's activation
   must be tanh; to train the MLP yourself, use sigmoid instead.        */
/************************************************************************/
double* HiddenLayer::Forward_propagation(double* pdInputData)
{
    NeuralBase::Forward_propagation(pdInputData);
    for (int i = 0; i < m_iOut; ++i)
    {
        // m_pdOutdata[i] = sigmoid(m_pdOutdata[i]);
        m_pdOutdata[i] = mytanh(m_pdOutdata[i]);
    }
    return m_pdOutdata;
}

void HiddenLayer::Back_propagation(double *pdInputData, double *pdNextLayerDelta,
                                   double** ppdnextLayerW, int iNextLayerOutNum, double dLr)
{
    /*
      pdInputData       input data
      pdNextLayerDelta  the next layer's delta (residual), an array of size iNextLayerOutNum
      ppdnextLayerW     weights from this layer to the next layer
      iNextLayerOutNum  the next layer's n_out
      dLr               learning rate
      m_iSampleNum      total number of training samples
    */
    // sigma must have as many elements as this layer has units;
    // the code found online gets this wrong (its author clearly never tested it)
    //double* sigma = new double[iNextLayerOutNum];
    double* sigma = new double[m_iOut];
    //double sigma[10];
    for (int i = 0; i < m_iOut; ++i)
        sigma[i] = 0.0;
    for (int i = 0; i < iNextLayerOutNum; ++i)
    {
        for (int j = 0; j < m_iOut; ++j)
        {
            sigma[j] += ppdnextLayerW[i][j] * pdNextLayerDelta[i];
        }
    }
    // compute this layer's delta
    for (int i = 0; i < m_iOut; ++i)
    {
        m_pdDelta[i] = sigma[i] * m_pdOutdata[i] * (1 - m_pdOutdata[i]);
    }
    // update this layer's weights w
    for (int i = 0; i < m_iOut; ++i)
    {
        for (int j = 0; j < m_iInput; ++j)
        {
            m_ppdW[i][j] += dLr * m_pdDelta[i] * pdInputData[j];
        }
        m_pdBias[i] += dLr * m_pdDelta[i];
    }
    delete[] sigma;
}
logisticRegression.h
#ifndef LOGISTICREGRESSIONLAYER
#define LOGISTICREGRESSIONLAYER

#include "neuralbase.h"

typedef unsigned short WORD;

class LogisticRegression : public NeuralBase
{
public:
    LogisticRegression(int n_i, int i_o, int);
    ~LogisticRegression();
    double* Forward_propagation(double* input_data);
    void Softmax(double* x);
    void Train(double *pdTrain, WORD usLabel, double dLr);
    void SetOldWb(double ppdWeigh[][3], double arriBias[8]);
    int Predict(double *);
    void MakeLabels(int* piMax, double (*pplabels)[8]);
};

void Test_lr();
void Testwb();
void Test_theano(const int m_iInput, const int m_iOut);
#endif
logisticRegression.cpp
#include <iostream>
#include <cmath>
#include <cassert>
#include <cstdlib>
#include <iomanip>
#include "logisticRegression.h"
#include "util.h"

using namespace std;

LogisticRegression::LogisticRegression(int n_i, int n_o, int n_t) : NeuralBase(n_i, n_o, n_t)
{
}

LogisticRegression::~LogisticRegression()
{
}

void LogisticRegression::Softmax(double* x)
{
    double _max = 0.0;
    double _sum = 0.0;
    for (int i = 0; i < m_iOut; ++i)
    {
        if (_max < x[i])
            _max = x[i];
    }
    for (int i = 0; i < m_iOut; ++i)
    {
        x[i] = exp(x[i] - _max);
        _sum += x[i];
    }
    for (int i = 0; i < m_iOut; ++i)
    {
        x[i] /= _sum;
    }
}

double* LogisticRegression::Forward_propagation(double* pdinputdata)
{
    NeuralBase::Forward_propagation(pdinputdata);
    /************************************************************************/
    /* for debugging                                                        */
    /************************************************************************/
    //cout << "Forward_propagation from LogisticRegression" << endl;
    //PrintOutputData();
    //cout << "over\n";
    Softmax(m_pdOutdata);
    return m_pdOutdata;
}

int LogisticRegression::Predict(double *pdtest)
{
    Forward_propagation(pdtest);
    /************************************************************************/
    /* for debugging                                                        */
    /************************************************************************/
    //PrintOutputData();
    int iResult = getMaxIndex(m_pdOutdata, m_iOut);
    return iResult;
}

void LogisticRegression::Train(double *pdTrain, WORD usLabel, double dLr)
{
    Forward_propagation(pdTrain);
    double *pdLabel = new double[m_iOut];
    MakeOneLabel(usLabel, pdLabel);
    Back_propagation(pdTrain, pdLabel, dLr);
    delete []pdLabel;
}

//double LogisticRegression::CalErrorRate(const vector<double*> &vecvalid, const vector<WORD> &vecValidlabel)
//{
//    int iErrorNumber = 0, iValidNumber = vecValidlabel.size();
//    for (int i = 0; i < iValidNumber; ++i)
//    {
//        int iResult = Predict(vecvalid[i]);
//        if (iResult != vecValidlabel[i])
//        {
//            ++iErrorNumber;
//        }
//    }
//    cout << "the num of error is " << iErrorNumber << endl;
//    double dErrorRate = (double)iErrorNumber / iValidNumber;
//    cout << "the error rate of Train sample by softmax is " << setprecision(10)
//         << dErrorRate * 100 << "%" << endl;
//    return dErrorRate;
//}

void LogisticRegression::SetOldWb(double ppdWeigh[][3], double arriBias[8])
{
    for (int i = 0; i < m_iOut; ++i)
    {
        for (int j = 0; j < m_iInput; ++j)
            m_ppdW[i][j] = ppdWeigh[i][j];
        m_pdBias[i] = arriBias[i];
    }
    cout << "Setwb----------" << endl;
    printArrDouble(m_ppdW, m_iOut, m_iInput);
    printArr(m_pdBias, m_iOut);
}

//void LogisticRegression::TrainAllSample(const vector<double*> &vecTrain, const vector<WORD> &vectrainlabel, double dLr)
//{
//    for (int j = 0; j < m_iSamplenum; ++j)
//    {
//        Train(vecTrain[j], vectrainlabel[j], dLr);
//    }
//}

void LogisticRegression::MakeLabels(int* piMax, double (*pplabels)[8])
{
    for (int i = 0; i < m_iSamplenum; ++i)
    {
        for (int j = 0; j < m_iOut; ++j)
            pplabels[i][j] = 0;
        int k = piMax[i];
        pplabels[i][k] = 1.0;
    }
}

void Test_theano(const int m_iInput, const int m_iOut)
{
    // construct LogisticRegression
    LogisticRegression classifier(m_iInput, m_iOut, 0);
    vector<double*> vecTrain, vecvalid, vecw;
    vector<double> vecb;
    vector<WORD> vecValidlabel, vectrainlabel;
    LoadTestSampleFromJson(vecvalid, vecValidlabel, "../.../../data/mnist.json", m_iInput);
    LoadTestSampleFromJson(vecTrain, vectrainlabel, "../.../../data/mnisttrain.json", m_iInput);
    // test
    int itrainnum = vecTrain.size();
    classifier.m_iSamplenum = itrainnum;
    const int iepochs = 5;
    const double dLr = 0.1;
    for (int i = 0; i < iepochs; ++i)
    {
        classifier.TrainAllSample(vecTrain, vectrainlabel, dLr);
        if (i % 2 == 0)
        {
            cout << "Predict------------" << i + 1 << endl;
            classifier.CalErrorRate(vecvalid, vecValidlabel);
        }
    }
    for (vector<double*>::iterator cit = vecTrain.begin(); cit != vecTrain.end(); ++cit)
    {
        delete [](*cit);
    }
    for (vector<double*>::iterator cit = vecvalid.begin(); cit != vecvalid.end(); ++cit)
    {
        delete [](*cit);
    }
}

void Test_lr()
{
    srand(0);
    double learning_rate = 0.1;
    double n_epochs = 200;
    int test_N = 2;
    const int trainNum = 8, m_iInput = 3, m_iOut = 8;
    //int m_iOut = 2;
    double train_X[trainNum][m_iInput] = {
        {1, 1, 1},
        {1, 1, 0},
        {1, 0, 1},
        {1, 0, 0},
        {0, 1, 1},
        {0, 1, 0},
        {0, 0, 1},
        {0, 0, 0}
    };
    // sziMax holds the index of each sample's maximum value
    int sziMax[trainNum];
    for (int i = 0; i < trainNum; ++i)
        sziMax[i] = trainNum - i - 1;
    // construct LogisticRegression
    LogisticRegression classifier(m_iInput, m_iOut, trainNum);
    // Train online
    for (int epoch = 0; epoch < n_epochs; ++epoch)
neuralbase.h
[cpp]viSEO靠我ew plain copy #ifndef NEURALBASE_H #define NEURALBASE_H #includeusing std::vector; typedef unsSEO靠我igned short WORD; class NeuralBase { public: NeuralBase(int , int , int); virtual SEO靠我~NeuralBase(); virtual double* Forward_propagation(double* ); virtual void Back_propagatSEO靠我ion(double* , double* , double ); virtual void Train(double *x, WORD y, double dLr); virSEO靠我tual int Predict(double *); void Callbackwb(); void MakeOneLabel(int iMax, double *pdLabSEO靠我el); void TrainAllSample(const vector &vecTrain, const vector&vectrainlabel, double dLr); SEO靠我 double CalErrorRate(const vector &vecvalid, const vector&vecValidlabel); void Printwb(); SEO靠我 void Writewb(const char *szName); long Readwb(const char *szName, long); void Setwb(vecSEO靠我tor &vpdw, vector&vdb); void PrintOutputData(); int m_iInput; int m_iOut; intSEO靠我 m_iSamplenum; double** m_ppdW; double* m_pdBias; //本层前向传播的输出值,也是最终的预测值 doubSEO靠我le* m_pdOutdata; //反向传播时所需值 double* m_pdDelta; private: void _callbackwb(); };SEO靠我 #endif // NEURALBASE_H
neuralbase.cpp
#include "neuralbase.h"
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <cassert>
#include <iomanip>
#include "util.h"

using namespace std;

NeuralBase::NeuralBase(int n_i, int n_o, int n_t) : m_iInput(n_i), m_iOut(n_o), m_iSamplenum(n_t)
{
    m_ppdW = new double* [m_iOut];
    for (int i = 0; i < m_iOut; ++i)
    {
        m_ppdW[i] = new double [m_iInput];
    }
    m_pdBias = new double [m_iOut];
    double a = 1.0 / m_iInput;
    srand((unsigned)time(NULL));
    for (int i = 0; i < m_iOut; ++i)
    {
        for (int j = 0; j < m_iInput; ++j)
            m_ppdW[i][j] = uniform(-a, a);
        m_pdBias[i] = uniform(-a, a);
    }
    m_pdDelta = new double [m_iOut];
    m_pdOutdata = new double [m_iOut];
}

NeuralBase::~NeuralBase()
{
    Callbackwb();
    delete[] m_pdOutdata;
    delete[] m_pdDelta;
}

void NeuralBase::Callbackwb()
{
    _callbackwb();
}

double NeuralBase::CalErrorRate(const vector<double*> &vecvalid, const vector<WORD> &vecValidlabel)
{
    int iErrorNumber = 0, iValidNumber = vecValidlabel.size();
    for (int i = 0; i < iValidNumber; ++i)
    {
        int iResult = Predict(vecvalid[i]);
        if (iResult != vecValidlabel[i])
        {
            ++iErrorNumber;
        }
    }
    cout << "the num of error is " << iErrorNumber << endl;
    double dErrorRate = (double)iErrorNumber / iValidNumber;
    cout << "the error rate of Train sample by softmax is " << setprecision(10)
         << dErrorRate * 100 << "%" << endl;
    return dErrorRate;
}

int NeuralBase::Predict(double *)
{
    cout << "NeuralBase::Predict(double *)" << endl;
    return 0;
}

void NeuralBase::_callbackwb()
{
    for (int i = 0; i < m_iOut; i++)
        delete []m_ppdW[i];
    delete[] m_ppdW;
    delete[] m_pdBias;
}

void NeuralBase::Printwb()
{
    cout << "****m_ppdW****\n";
    for (int i = 0; i < m_iOut; ++i)
    {
        for (int j = 0; j < m_iInput; ++j)
            cout << m_ppdW[i][j] << ' ';
        cout << endl;
    }
    cout << "****m_pdBias****\n";
    for (int i = 0; i < m_iOut; ++i)
    {
        cout << m_pdBias[i] << ' ';
    }
    cout << endl;
    cout << "****output****\n";
    for (int i = 0; i < m_iOut; ++i)
    {
        cout << m_pdOutdata[i] << ' ';
    }
    cout << endl;
}

double* NeuralBase::Forward_propagation(double* input_data)
{
    for (int i = 0; i < m_iOut; ++i)
    {
        m_pdOutdata[i] = 0.0;
        for (int j = 0; j < m_iInput; ++j)
        {
            m_pdOutdata[i] += m_ppdW[i][j] * input_data[j];
        }
        m_pdOutdata[i] += m_pdBias[i];
    }
    return m_pdOutdata;
}

void NeuralBase::Back_propagation(double* input_data, double* pdlabel, double dLr)
{
    for (int i = 0; i < m_iOut; ++i)
    {
        m_pdDelta[i] = pdlabel[i] - m_pdOutdata[i];
        for (int j = 0; j < m_iInput; ++j)
        {
            m_ppdW[i][j] += dLr * m_pdDelta[i] * input_data[j] / m_iSamplenum;
        }
        m_pdBias[i] += dLr * m_pdDelta[i] / m_iSamplenum;
    }
}

void NeuralBase::MakeOneLabel(int imax, double *pdlabel)
{
    for (int j = 0; j < m_iOut; ++j)
        pdlabel[j] = 0;
    pdlabel[imax] = 1.0;
}

void NeuralBase::Writewb(const char *szName)
{
    savewb(szName, m_ppdW, m_pdBias, m_iOut, m_iInput);
}

long NeuralBase::Readwb(const char *szName, long dstartpos)
{
    return loadwb(szName, m_ppdW, m_pdBias, m_iOut, m_iInput, dstartpos);
}

void NeuralBase::Setwb(vector<double*> &vpdw, vector<double> &vdb)
{
    assert(vpdw.size() == (DWORD)m_iOut);
    for (int i = 0; i < m_iOut; ++i)
    {
        delete []m_ppdW[i];
        m_ppdW[i] = vpdw[i];
        m_pdBias[i] = vdb[i];
    }
}

void NeuralBase::TrainAllSample(const vector<double*> &vecTrain, const vector<WORD> &vectrainlabel, double dLr)
{
    for (int j = 0; j < m_iSamplenum; ++j)
    {
        Train(vecTrain[j], vectrainlabel[j], dLr);
    }
}

void NeuralBase::Train(double *x, WORD y, double dLr)
{
    (void)x;
    (void)y;
    (void)dLr;
    cout << "NeuralBase::Train(double *x, WORD y, double dLr)" << endl;
}

void NeuralBase::PrintOutputData()
{
    for (int i = 0; i < m_iOut; ++i)
    {
        cout << m_pdOutdata[i] << ' ';
    }
    cout << endl;
}
util.h
#ifndef UTIL_H
#define UTIL_H

#include <vector>
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
using namespace std;

typedef unsigned char BYTE;
typedef unsigned short WORD;
typedef unsigned int DWORD;

double sigmoid(double x);
double mytanh(double dx);

typedef struct stShapeWb
{
    stShapeWb(int w, int h) : width(w), height(h) {}
    int width;
    int height;
} ShapeWb_S;

void MakeOneLabel(int iMax, double *pdLabel, int m_iOut);
double uniform(double _min, double _max);
//void printArr(T *parr, int num);
//void printArrDouble(double **pparr, int row, int col);
void initArr(double *parr, int num);
int getMaxIndex(double *pdarr, int num);
void Printivec(const vector<int> &ivec);
void savewb(const char *szName, double **m_ppdW, double *m_pdBias, int irow, int icol);
long loadwb(const char *szName, double **m_ppdW, double *m_pdBias,
            int irow, int icol, long dstartpos);
void TestLoadJson(const char *pcfilename);
bool LoadvtFromJson(vector<double*> &vecTrain, vector<WORD> &vecLabel, const char *filename, const int m_iInput);
bool LoadwbFromJson(vector<double*> &vecTrain, vector<double> &vecLabel, const char *filename, const int m_iInput);
bool LoadTestSampleFromJson(vector<double*> &vecTrain, vector<WORD> &vecLabel, const char *filename, const int m_iInput);
bool LoadwbByByte(vector<double*> &vecTrain, vector<double> &vecLabel, const char *filename, const int m_iInput);
bool LoadallwbByByte(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb,
                     const char *filename, const int m_iInput, const int ihidden, const int m_iOut);
bool LoadWeighFromJson(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb,
                       const char *filename, const vector<int> &vecSecondDimOfWeigh);
void MakeCnnSample(double arr[2][64], double *pdImage, int iImageWidth, int iNumOfImage);
void MakeCnnWeigh(double *, int iNumOfKernel);

template <typename T>
void printArr(T *parr, int num)
{
    cout << "****printArr****" << endl;
    for (int i = 0; i < num; ++i)
        cout << parr[i] << ' ';
    cout << endl;
}

template <typename T>
void printArrDouble(T **pparr, int row, int col)
{
    cout << "****printArrDouble****" << endl;
    for (int i = 0; i < row; ++i)
    {
        for (int j = 0; j < col; ++j)
        {
            cout << pparr[i][j] << ' ';
        }
        cout << endl;
    }
}
#endif
util.cpp
#include "util.h"
#include <iostream>
#include <ctime>
#include <cmath>
#include <cassert>
#include <fstream>
#include <cstring>
#include <stack>
#include <iomanip>
using namespace std;

int getMaxIndex(double *pdarr, int num)
{
    double dmax = -1;
    int iMax = -1;
    for (int i = 0; i < num; ++i)
    {
        if (pdarr[i] > dmax)
        {
            dmax = pdarr[i];
            iMax = i;
        }
    }
    return iMax;
}

double sigmoid(double dx)
{
    return 1.0 / (1.0 + exp(-dx));
}

double mytanh(double dx)
{
    double e2x = exp(2 * dx);
    return (e2x - 1) / (e2x + 1);
}

double uniform(double _min, double _max)
{
    return rand() / (RAND_MAX + 1.0) * (_max - _min) + _min;
}

void initArr(double *parr, int num)
{
    for (int i = 0; i < num; ++i)
        parr[i] = 0.0;
}

void savewb(const char *szName, double **m_ppdW, double *m_pdBias, int irow, int icol)
{
    FILE *pf;
    if ((pf = fopen(szName, "ab")) == NULL)
    {
        printf("File could not be opened ");
        return;
    }
    int isizeofelem = sizeof(double);
    for (int i = 0; i < irow; ++i)
    {
        if (fwrite((const void*)m_ppdW[i], isizeofelem, icol, pf) != icol)
        {
            fputs("Writing m_ppdW error", stderr);
            return;
        }
    }
    if (fwrite((const void*)m_pdBias, isizeofelem, irow, pf) != irow)
    {
        fputs("Writing m_pdBias error", stderr);
        return;
    }
    fclose(pf);
}

long loadwb(const char *szName, double **m_ppdW, double *m_pdBias,
            int irow, int icol, long dstartpos)
{
    FILE *pf;
    long dtotalbyte = 0, dreadsize;
    if ((pf = fopen(szName, "rb")) == NULL)
    {
        printf("File could not be opened ");
        return -1;
    }
    // move the file pointer to the right offset
    fseek(pf, dstartpos, SEEK_SET);
    int isizeofelem = sizeof(double);
    for (int i = 0; i < irow; ++i)
    {
        dreadsize = fread((void*)m_ppdW[i], isizeofelem, icol, pf);
        if (dreadsize != icol)
        {
            fputs("Reading m_ppdW error", stderr);
            return -1;
        }
        // accumulate each successful read into dtotalbyte, returned at the end
        dtotalbyte += dreadsize;
    }
    dreadsize = fread(m_pdBias, isizeofelem, irow, pf);
    if (dreadsize != irow)
    {
        fputs("Reading m_pdBias error", stderr);
        return -1;
    }
    dtotalbyte += dreadsize;
    dtotalbyte *= isizeofelem;
    fclose(pf);
    return dtotalbyte;
}

void Printivec(const vector<int> &ivec)
{
    for (vector<int>::const_iterator it = ivec.begin(); it != ivec.end(); ++it)
    {
        cout << *it << ' ';
    }
    cout << endl;
}

void TestLoadJson(const char *pcfilename)
{
    vector<double *> vpdw;
    vector<double> vdb;
    vector< vector<double*> > vvAllw;
    vector< vector<double> > vvAllb;
    int m_iInput = 28 * 28, ihidden = 500, m_iOut = 10;
    LoadallwbByByte(vvAllw, vvAllb, pcfilename, m_iInput, ihidden, m_iOut);
}

// read vt from mnist; the format is [[[], [],..., []],[1, 3, 5,..., 7]]
bool LoadvtFromJson(vector<double*> &vecTrain, vector<WORD> &vecLabel, const char *filename, const int m_iInput)
{
    cout << "loadvtFromJson" << endl;
    const int ciStackSize = 10;
    const int ciFeaturesize = m_iInput;
    int arriStack[ciStackSize], iTop = -1;
    ifstream ifs;
    ifs.open(filename, ios::in);
    assert(ifs.is_open());
    BYTE ucRead, ucLeftbrace, ucRightbrace, ucComma, ucSpace;
    ucLeftbrace = '[';
    ucRightbrace = ']';
    ucComma = ',';
    ucSpace = 0;
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arriStack[++iTop] = 1;
    // the train samples begin
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    arriStack[++iTop] = 1; // iTop is 1
    int iIndex;
    bool isdigit = false;
    double dread, *pdvt = NULL;
    // load the vt samples
    while (iTop > 0)
    {
        if (isdigit == false)
        {
            ifs >> ucRead;
            isdigit = true;
            if (ucRead == ucComma)
            {
                // next char is space or leftbrace
                // ifs >> ucRead;
                isdigit = false;
                continue;
            }
            if (ucRead == ucSpace)
            {
                // if pdvt is null, the next char is a leftbrace;
                // else the next char is a double value
                if (pdvt == NULL)
                    isdigit = false;
                continue;
            }
            if (ucRead == ucLeftbrace)
            {
                pdvt = new double[ciFeaturesize];
                memset(pdvt, 0, ciFeaturesize * sizeof(double));
                // iIndex is the array index
                iIndex = 0;
                arriStack[++iTop] = 1;
                continue;
            }
            if (ucRead == ucRightbrace)
            {
                if (pdvt != NULL)
                {
                    assert(iIndex == ciFeaturesize);
                    vecTrain.push_back(pdvt);
                    pdvt = NULL;
                }
                isdigit = false;
                arriStack[iTop--] = 0;
                continue;
            }
        }
        else
        {
            ifs >> dread;
            pdvt[iIndex++] = dread;
            isdigit = false;
        }
    }
    // next char is a comma
    ifs >> ucRead;
    assert(ucRead == ucComma);
    cout << vecTrain.size() << endl;
    // read the labels
    WORD usread;
    isdigit = false;
    while (iTop > -1 && ifs.eof() == false)
    {
        if (isdigit == false)
        {
            ifs >> ucRead;
            isdigit = true;
            if (ucRead == ucComma)
            {
                // next char is space or leftbrace
                // ifs >> ucRead;
                // isdigit = false;
                continue;
            }
            if (ucRead == ucSpace)
            {
                // if pdvt is null, the next char is a leftbrace;
                // else the next char is a double value
                if (pdvt == NULL)
                    isdigit = false;
                continue;
            }
            if (ucRead == ucLeftbrace)
            {
                arriStack[++iTop] = 1;
                continue;
            }
            // the char after a right bracket is another right bracket (the last char)
            if (ucRead == ucRightbrace)
            {
                isdigit = false;
                arriStack[iTop--] = 0;
                continue;
            }
        }
        else
        {
            ifs >> usread;
            vecLabel.push_back(usread);
            isdigit = false;
        }
    }
    assert(vecLabel.size() == vecTrain.size());
    assert(iTop == -1);
    ifs.close();
    return true;
}

bool testjsonfloat(const char *filename)
{
    vector<double> vecTrain;
    cout << "testjsondouble" << endl;
    const int ciStackSize = 10;
    int arriStack[ciStackSize], iTop = -1;
    ifstream ifs;
    ifs.open(filename, ios::in);
    assert(ifs.is_open());
    BYTE ucRead, ucLeftbrace, ucRightbrace, ucComma;
    ucLeftbrace = '[';
    ucRightbrace = ']';
    ucComma = ',';
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arriStack[++iTop] = 1;
    // the train samples begin
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    arriStack[++iTop] = 1; // iTop is 1
    double fread;
    bool isdigit = false;
    while (iTop > -1)
    {
        if (isdigit == false)
        {
            ifs >> ucRead;
            isdigit = true;
            if (ucRead == ucComma)
            {
                // next char is space or leftbrace
                // ifs >> ucRead;
                isdigit = false;
                continue;
            }
            if (ucRead == ' ')
                continue;
            if (ucRead == ucLeftbrace)
            {
                arriStack[++iTop] = 1;
                continue;
            }
            if (ucRead == ucRightbrace)
            {
                isdigit = false;
                // the char after a right bracket is another right bracket (the last char)
                arriStack[iTop--] = 0;
                continue;
            }
        }
        else
        {
            ifs >> fread;
            vecTrain.push_back(fread);
            isdigit = false;
        }
    }
    ifs.close();
    return true;
}

bool LoadwbFromJson(vector<double*> &vecTrain, vector<double> &vecLabel, const char *filename, const int m_iInput)
{
    cout << "loadvtFromJson" << endl;
    const int ciStackSize = 10;
    const int ciFeaturesize = m_iInput;
    int arriStack[ciStackSize], iTop = -1;
    ifstream ifs;
    ifs.open(filename, ios::in);
    assert(ifs.is_open());
    BYTE ucRead, ucLeftbrace, ucRightbrace, ucComma, ucSpace;
    ucLeftbrace = '[';
    ucRightbrace = ']';
    ucComma = ',';
    ucSpace = 0;
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arriStack[++iTop] = 1;
    // the train samples begin
    ifs >> ucRead;
    assert(ucRead == ucLeftbrace);
    arriStack[++iTop] = 1; // iTop is 1
    int iIndex;
    bool isdigit = false;
    double dread, *pdvt = NULL;
    // load the vt samples
    while (iTop > 0)
    {
        if (isdigit == false)
        {
            ifs >> ucRead;
            isdigit = true;
            if (ucRead == ucComma)
            {
                // next char is space or leftbrace
                // ifs >> ucRead;
                isdigit = false;
                continue;
            }
            if (ucRead == ucSpace)
            {
                // if pdvt is null, the next char is a leftbrace;
                // else the next char is a double value
                if (pdvt == NULL)
                    isdigit = false;
                continue;
            }
            if (ucRead == ucLeftbrace)
            {
                pdvt = new double[ciFeaturesize];
                memset(pdvt, 0, ciFeaturesize * sizeof(double));
                // iIndex is the array index
                iIndex = 0;
                arriStack[++iTop] = 1;
                continue;
            }
            if (ucRead == ucRightbrace)
            {
                if (pdvt != NULL)
                {
                    assert(iIndex == ciFeaturesize);
                    vecTrain.push_back(pdvt);
                    pdvt = NULL;
                }
                isdigit = false;
                arriStack[iTop--] = 0;
                continue;
            }
        }
        else
        {
            ifs >> dread;
            pdvt[iIndex++] = dread;
            isdigit = false;
        }
    }
    // next char is a comma
    ifs >> ucRead;
    assert(ucRead == ucComma);
    cout << vecTrain.size() << endl;
    // read the labels
    double usread;
    isdigit = false;
    while (iTop > -1 && ifs.eof() == false)
    {
        if (isdigit == false)
        {
            ifs >> ucRead;
            isdigit = true;
            if (ucRead == ucComma)
            {
                // next char is space or leftbrace
                // ifs >> ucRead;
                // isdigit = false;
                continue;
            }
            if (ucRead == ucSpace)
            {
                // if pdvt is null, the next char is a leftbrace;
                // else the next char is a double value
                if (pdvt == NULL)
                    isdigit = false;
                continue;
            }
            if (ucRead == ucLeftbrace)
            {
                arriStack[++iTop] = 1;
                continue;
            }
            // the char after a right bracket is another right bracket (the last char)
            if (ucRead == ucRightbrace)
            {
                isdigit = false;
                arriStack[iTop--] = 0;
                continue;
            }
        }
        else
        {
            ifs >> usread;
            vecLabel.push_back(usread);
            isdigit = false;
        }
    }
    assert(vecLabel.size() == vecTrain.size());
    assert(iTop == -1);
    ifs.close();
    return true;
}

bool vec2double(vector<BYTE> &vecDigit, double &dvalue)
{
    if (vecDigit.empty())
        return false;
    int ivecsize = vecDigit.size();
    const int iMaxlen = 50;
    char szdigit[iMaxlen];
    assert(iMaxlen > ivecsize);
    memset(szdigit, 0, iMaxlen);
    int i;
    for (i = 0; i < ivecsize; ++i)
        szdigit[i] = vecDigit[i];
    szdigit[i++] = '\0';
    vecDigit.clear();
    dvalue = atof(szdigit);
    return true;
}

bool vec2short(vector<BYTE> &vecDigit, WORD &usvalue)
{
    if (vecDigit.empty())
        return false;
    int ivecsize = vecDigit.size();
    const int iMaxlen = 50;
    char szdigit[iMaxlen];
    assert(iMaxlen > ivecsize);
    memset(szdigit, 0, iMaxlen);
    int i;
    for (i = 0; i < ivecsize; ++i)
        szdigit[i] = vecDigit[i];
    szdigit[i++] = '\0';
    vecDigit.clear();
    usvalue = atoi(szdigit);
    return true;
}

void readDigitFromJson(ifstream &ifs, vector<double*> &vecTrain, vector<WORD> &vecLabel,
                       vector<BYTE> &vecDigit, double *&pdvt, int &iIndex,
                       const int ciFeaturesize, int *arrStack, int &iTop, bool bFirstlist)
{
    BYTE ucRead;
    WORD usvalue;
    double dvalue;
    const BYTE ucLeftbrace = '[', ucRightbrace = ']', ucComma = ',', ucSpace = ' ';
    ifs.read((char*)(&ucRead), 1);
    switch (ucRead)
    {
    case ucLeftbrace:
    {
        if (bFirstlist)
        {
            pdvt = new double[ciFeaturesize];
            memset(pdvt, 0, ciFeaturesize * sizeof(double));
            iIndex = 0;
        }
        arrStack[++iTop] = 1;
        break;
    }
    case ucComma:
    {
        // next char is space or leftbrace
        if (bFirstlist)
        {
            if (vecDigit.empty() == false)
            {
                vec2double(vecDigit, dvalue);
                pdvt[iIndex++] = dvalue;
            }
        }
        else
        {
            if (vec2short(vecDigit, usvalue))
                vecLabel.push_back(usvalue);
        }
        break;
    }
    case ucSpace:
        break;
    case ucRightbrace:
    {
        if (bFirstlist)
        {
            if (pdvt != NULL)
            {
                vec2double(vecDigit, dvalue);
                pdvt[iIndex++] = dvalue;
                vecTrain.push_back(pdvt);
                pdvt = NULL;
            }
            assert(iIndex == ciFeaturesize);
        }
        else
        {
            if (vec2short(vecDigit, usvalue))
                vecLabel.push_back(usvalue);
        }
        arrStack[iTop--] = 0;
        break;
    }
    default:
    {
        vecDigit.push_back(ucRead);
        break;
    }
    }
}

void readDoubleFromJson(ifstream &ifs, vector<double*> &vecTrain, vector<double> &vecLabel,
                        vector<BYTE> &vecDigit, double *&pdvt, int &iIndex,
                        const int ciFeaturesize, int *arrStack, int &iTop, bool bFirstlist)
{
    BYTE ucRead;
    double dvalue;
    const BYTE ucLeftbrace = '[', ucRightbrace = ']', ucComma = ',', ucSpace = ' ';
    ifs.read((char*)(&ucRead), 1);
    switch (ucRead)
    {
    case ucLeftbrace:
    {
        if (bFirstlist)
        {
            pdvt = new double[ciFeaturesize];
            memset(pdvt, 0, ciFeaturesize * sizeof(double));
            iIndex = 0;
        }
        arrStack[++iTop] = 1;
        break;
    }
    case ucComma:
    {
        // next char is space or leftbrace
        if (bFirstlist)
        {
            if (vecDigit.empty() == false)
            {
                vec2double(vecDigit, dvalue);
                pdvt[iIndex++] = dvalue;
            }
        }
        else
        {
            if (vec2double(vecDigit, dvalue))
                vecLabel.push_back(dvalue);
        }
        break;
    }
    case ucSpace:
        break;
    case ucRightbrace:
    {
        if (bFirstlist)
        {
            if (pdvt != NULL)
            {
                vec2double(vecDigit, dvalue);
                pdvt[iIndex++] = dvalue;
                vecTrain.push_back(pdvt);
                pdvt = NULL;
            }
            assert(iIndex == ciFeaturesize);
        }
        else
        {
            if (vec2double(vecDigit, dvalue))
                vecLabel.push_back(dvalue);
        }
        arrStack[iTop--] = 0;
        break;
    }
    default:
    {
        vecDigit.push_back(ucRead);
        break;
    }
    }
}

bool LoadallwbByByte(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb,
                     const char *filename, const int m_iInput, const int ihidden, const int m_iOut)
{
    cout << "LoadallwbByByte" << endl;
    const int szistsize = 10;
    int ciFeaturesize = m_iInput;
    const BYTE ucLeftbrace = '[', ucRightbrace = ']', ucComma = ',', ucSpace = ' ';
    int arrStack[szistsize], iTop = -1, iIndex = 0;
    ifstream ifs;
    ifs.open(filename, ios::in | ios::binary);
    assert(ifs.is_open());
    double *pdvt = NULL;
    BYTE ucRead;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arrStack[++iTop] = 1;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 1
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 2
    vector<BYTE> vecDigit;
    vector<double *> vpdw;
    vector<double> vdb;
    while (iTop > 1 && ifs.eof() == false)
    {
        readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, m_iInput, arrStack, iTop, true);
    }
    // next char is a comma
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucComma);
    cout << vpdw.size() << endl;
    // next char is space
    while (iTop > 0 && ifs.eof() == false)
    {
        readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, m_iInput, arrStack, iTop, false);
    }
    assert(vpdw.size() == vdb.size());
    assert(iTop == 0);
    vvAllw.push_back(vpdw);
    vvAllb.push_back(vdb);
    // clear vpdw's and vdb's contents
    vpdw.clear();
    vdb.clear();
    // next char is comma
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucComma);
    // next char is space
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucSpace);
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 1
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 2
    while (iTop > 1 && ifs.eof() == false)
    {
        readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, ihidden, arrStack, iTop, true);
    }
    // next char is a comma
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucComma);
    cout << vpdw.size() << endl;
    // next char is space
    while (iTop > -1 && ifs.eof() == false)
    {
        readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, ihidden, arrStack, iTop, false);
    }
    assert(vpdw.size() == vdb.size());
    assert(iTop == -1);
    vvAllw.push_back(vpdw);
    vvAllb.push_back(vdb);
    // clear vpdw's and vdb's contents
    vpdw.clear();
    vdb.clear();
    // close file
    ifs.close();
    return true;
}

bool LoadWeighFromJson(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb,
                       const char *filename, const vector<int> &vecSecondDimOfWeigh)
{
    cout << "LoadWeighFromJson" << endl;
    const int szistsize = 10;
    const BYTE ucLeftbrace = '[', ucRightbrace = ']', ucComma = ',', ucSpace = ' ';
    int arrStack[szistsize], iTop = -1, iIndex = 0;
    ifstream ifs;
    ifs.open(filename, ios::in | ios::binary);
    assert(ifs.is_open());
    double *pdvt = NULL;
    BYTE ucRead;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arrStack[++iTop] = 1;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 1
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 2
    int iVecWeighSize = vecSecondDimOfWeigh.size();
    vector<BYTE> vecDigit;
    vector<double *> vpdw;
    vector<double> vdb;
    // read iVecWeighSize pairs of [w, b]
    for (int i = 0; i < iVecWeighSize; ++i)
    {
        int iDimesionOfWeigh = vecSecondDimOfWeigh[i];
        while (iTop > 1 && ifs.eof() == false)
        {
            readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, iDimesionOfWeigh, arrStack, iTop, true);
        }
        // next char is a comma
        ifs.read((char*)(&ucRead), 1);
        assert(ucRead == ucComma);
        cout << vpdw.size() << endl;
        // next char is space
        while (iTop > 0 && ifs.eof() == false)
        {
            readDoubleFromJson(ifs, vpdw, vdb, vecDigit, pdvt, iIndex, iDimesionOfWeigh, arrStack, iTop, false);
        }
        assert(vpdw.size() == vdb.size());
        assert(iTop == 0);
        vvAllw.push_back(vpdw);
        vvAllb.push_back(vdb);
        // clear vpdw's and vdb's contents
        vpdw.clear();
        vdb.clear();
        // once the last [w, b] pair has been read, break out; the next char is the closing bracket ]
        if (i >= iVecWeighSize - 1)
        {
            break;
        }
        // next char is comma
        ifs.read((char*)(&ucRead), 1);
        assert(ucRead == ucComma);
        // next char is space
        ifs.read((char*)(&ucRead), 1);
        assert(ucRead == ucSpace);
        ifs.read((char*)(&ucRead), 1);
        assert(ucRead == ucLeftbrace);
        arrStack[++iTop] = 1; // iTop is 1
        ifs.read((char*)(&ucRead), 1);
        assert(ucRead == ucLeftbrace);
        arrStack[++iTop] = 1; // iTop is 2
    }
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucRightbrace);
    --iTop;
    assert(iTop == -1);
    // close file
    ifs.close();
    return true;
}

// read vt from mnist; the format is [[[], [],..., []],[1, 3, 5,..., 7]]
bool LoadTestSampleFromJson(vector<double*> &vecTrain, vector<WORD> &vecLabel, const char *filename, const int m_iInput)
{
    cout << "LoadTestSampleFromJson" << endl;
    const int szistsize = 10;
    const int ciFeaturesize = m_iInput;
    const BYTE ucLeftbrace = '[', ucRightbrace = ']', ucComma = ',', ucSpace = ' ';
    int arrStack[szistsize], iTop = -1, iIndex = 0;
    ifstream ifs;
    ifs.open(filename, ios::in | ios::binary);
    assert(ifs.is_open());
    double *pdvt = NULL;
    BYTE ucRead;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    // the stack holds only left brackets: 1 means pushed, 0 means cleared
    arrStack[++iTop] = 1;
    ifs.read((char*)(&ucRead), 1);
    assert(ucRead == ucLeftbrace);
    arrStack[++iTop] = 1; // iTop is 1
    vector<BYTE> vecDigit;
    while (iTop > 0 && ifs.eof() == false)
    {
        readDigitFromJson(ifs, vecTrain, vecLabel, vecDigit, pdvt, iIndex, ciFeaturesize, arrStack, iTop, true);
    }
    // next char is a comma
    ifs >> ucRead;
    assert(ucRead == ucComma);
    cout << vecTrain.size() << endl;
    // next char is space
    // ifs.read((char*)(&ucRead), 1);
    // ifs.read((char*)(&ucRead), 1);
    // assert(ucRead == ucLeftbrace);
    while (iTop > -1 && ifs.eof() == false)
    {
        readDigitFromJson(ifs, vecTrain, vecLabel, vecDigit, pdvt, iIndex, ciFeaturesize, arrStack, iTop, false);
    }
    assert(vecLabel.size() == vecTrain.size());
    assert(iTop == -1);
    ifs.close();
    return true;
}

void MakeOneLabel(int iMax, double *pdLabel, int m_iOut)
{
    for (int j = 0; j < m_iOut; ++j)
        pdLabel[j] = 0;
    pdLabel[iMax] = 1.0;
}

void MakeCnnSample(double arrInput[2][64], double *pdImage, int iImageWidth, int iNumOfImage)
{
    int iImageSize = iImageWidth * iImageWidth;
    for (int k = 0; k < iNumOfImage; ++k)
    {
        int iStart = k * iImageSize;
        for (int i = 0; i < iImageWidth; ++i)
        {
            for (int j = 0; j < iImageWidth; ++j)
            {
                int iIndex = iStart + i * iImageWidth + j;
                pdImage[iIndex] = 1;
                pdImage[iIndex] += i + j;
                if (k > 0)
                    pdImage[iIndex] -= 1;
                arrInput[k][i * iImageWidth + j] = pdImage[iIndex];
                //pdImage[iIndex] /= 15.0;
            }
        }
    }
    cout << "input image is\n";
    for (int k = 0; k < iNumOfImage; ++k)
    {
        int iStart = k * iImageSize;
        cout << "k is " << k << endl;
        for (int i = 0; i < iImageWidth; ++i)
        {
            for (int j = 0; j < iImageWidth; ++j)
            {
                int iIndex = i * iImageWidth + j;
                double dValue = arrInput[k][iIndex];
                cout << dValue << ' ';
            }
            cout << endl;
        }
        cout << endl;
    }
    cout << endl;
}

void MakeCnnWeigh(double *pdKernel, int iNumOfKernel)
{
    const int iKernelWidth = 3;
    double iSum = 0;
    double arrKernel[iKernelWidth][iKernelWidth] = {{4, 7, 1},
                                                    {3, 8, 5},
                                                    {3, 2, 3}};
    double arr2[iKernelWidth][iKernelWidth] = {{6, 5, 4},
                                               {5, 4, 3},
                                               {4, 3, 2}};
    for (int k = 0; k < iNumOfKernel; ++k)
    {
        int iStart = k * iKernelWidth * iKernelWidth;
        for (int i = 0; i < iKernelWidth; ++i)
        {
            for (int j = 0; j < iKernelWidth; ++j)
            {
                int iIndex = i * iKernelWidth + j + iStart;
                pdKernel[iIndex] = i + j + 2;
                if (k > 0)
                    pdKernel[iIndex] = arrKernel[i][j];
                iSum += pdKernel[iIndex];
            }
        }
    }
    cout << "sum is " << iSum << endl;
    for (int k = 0; k < iNumOfKernel; ++k)
    {
        cout << "kernel :" << k << endl;
        int iStart = k * iKernelWidth * iKernelWidth;
        for (int i = 0; i < iKernelWidth; ++i)
        {
            for (int j = 0; j < iKernelWidth; ++j)
            {
                int iIndex = i * iKernelWidth + j + iStart;
                //pdKernel[iIndex] /= (double)iSum;
                cout << pdKernel[iIndex] << ' ';
            }
            cout << endl;
        }
        cout << endl;
    }
    cout << endl;
}
The code that trains for two epochs and generates the theano weights:
cnn_mlp_theano.py
#coding=utf-8
import cPickle
import gzip
import os
import sys
import time
import json

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import downsample
from theano.tensor.nnet import conv

from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer


class LeNetConvPoolLayer(object):
    """Pool Layer of a convolutional network """

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows, #cols)
        """

        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" /
        #   pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # initialize weights with random weights
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(numpy.asarray(
            rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
            dtype=theano.config.floatX),
                               borrow=True)

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv.conv2d(input=input, filters=self.W,
                filter_shape=filter_shape, image_shape=image_shape)

        # downsample each feature map individually, using maxpooling
        pooled_out = downsample.max_pool_2d(input=conv_out,
                                            ds=poolsize, ignore_border=True)

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store parameters of this layer
        self.params = [self.W, self.b]


def getDataNumpy(layers):
    data = []
    for layer in layers:
        wb = layer.params
        w, b = wb[0].get_value(), wb[1].get_value()
        data.append([w, b])
    return data


def getDataJson(layers):
    data = []
    i = 0
    for layer in layers:
        w, b = layer.params
        # print '..layer is', i
        w, b = w.get_value(), b.get_value()
        wshape = w.shape
        # print '...the shape of w is', wshape
        if len(wshape) == 2:
            w = w.transpose()
        else:
            for k in xrange(wshape[0]):
                for j in xrange(wshape[1]):
                    w[k][j] = numpy.rot90(w[k][j], 2)
            w = w.reshape((wshape[0], numpy.prod(wshape[1:])))

        w = w.tolist()
        b = b.tolist()
        data.append([w, b])
        i += 1
    return data


def writefile(data, name='../../tmp/src/data/theanocnn.json'):
    print ('writefile is ' + name)
    f = open(name, "wb")
    json.dump(data, f)
    f.close()


def readfile(layers, nkerns, name='../../tmp/src/data/theanocnn.json'):
    # Load the dataset
    print ('readfile is ' + name)
    f = open(name, 'rb')
    data = json.load(f)
    f.close()
    readwb(data, layers, nkerns)


def readwb(data, layers, nkerns):
    i = 0
    kernSize = len(nkerns)
    inputnum = 1
    for layer in layers:
        w, b = data[i]
        w = numpy.array(w, dtype='float32')
        b = numpy.array(b, dtype='float32')

        # print '..layer is', i
        # print w.shape
        if i >= kernSize:
            w = w.transpose()
        else:
            w = w.reshape((nkerns[i], inputnum, 5, 5))
            for k in xrange(nkerns[i]):
                for j in xrange(inputnum):
                    c = w[k][j]
                    w[k][j] = numpy.rot90(c, 2)
            inputnum = nkerns[i]
            # print '..readwb, transpose and rot180'
            # print w.shape
        layer.W.set_value(w, borrow=True)
        layer.b.set_value(b, borrow=True)
        i += 1


def loadwb(classifier, name='theanocnn.json'):
    data = json.load(open(name, 'rb'))
    w, b = data
    print type(w)
    w = numpy.array(w, dtype='float32').transpose()
    classifier.W.set_value(w, borrow=True)
    classifier.b.set_value(b, borrow=True)


def savewb(classifier, name='theanocnn.json'):
    w, b = classifier.params
    w = w.get_value().transpose().tolist()
    b = b.get_value().tolist()
    data = [w, b]
    json.dump(data, open(name, 'wb'))


def evaluate_lenet5(learning_rate=0.1, n_epochs=2,
                    dataset='../../data/mnist.pkl',
                    nkerns=[20, 50], batch_size=500):
    """ Demonstrates lenet on MNIST dataset

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path to the dataset used for training /testing (MNIST here)

    :type nkerns: list of ints
    :param nkerns: number of kernels on each layer
    """

    rng = numpy.random.RandomState(23455)

    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size
    n_valid_batches /= batch_size
    n_test_batches /= batch_size

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')    # the data is presented as rasterized images
    y = T.ivector('y')   # the labels are presented as 1D vector of
                         # [int] labels

    ishape = (28, 28)  # this is the size of MNIST images

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # Reshape matrix of rasterized images of shape (batch_size, 28*28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    layer0_input = x.reshape((batch_size, 1, 28, 28))

    # Construct the first convolutional pooling layer:
    # filtering reduces the image size to (28-5+1, 28-5+1) = (24, 24)
    # maxpooling reduces this further to (24/2, 24/2) = (12, 12)
    # 4D output tensor is thus of shape (batch_size, nkerns[0], 12, 12)
    layer0 = LeNetConvPoolLayer(rng, input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5), poolsize=(2, 2))

    # Construct the second convolutional pooling layer
    # filtering reduces the image size to (12-5+1, 12-5+1) = (8, 8)
    # maxpooling reduces this further to (8/2, 8/2) = (4, 4)
    # 4D output tensor is thus of shape (nkerns[0], nkerns[1], 4, 4)
    layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5), poolsize=(2, 2))

    # the TanhLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size, num_pixels) (i.e matrix of rasterized images).
    # This will generate a matrix of shape (20, 32*4*4) = (20, 512)
    layer2_input = layer1.output.flatten(2)

    # construct a fully-connected sigmoidal layer
    layer2 = HiddenLayer(rng, input=layer2_input, n_in=nkerns[1] * 4 * 4,
                         n_out=500, activation=T.tanh)

    # classify the values of the fully-connected sigmoidal layer
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    cost = layer3.negative_log_likelihood(y)

    # create a function to compute the mistakes that are made by the model
    test_model = theano.function([index], layer3.errors(y),
             givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]})

    validate_model = theano.function([index], layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]})

    # create a list of all model parameters to be fit by gradient descent
    params = layer3.params + layer2.params + layer1.params + layer0.params

    # create a list of gradients for all model parameters
    grads = T.grad(cost, params)

    layers = [layer0, layer1, layer2, layer3]

    # train_model is a function that updates the model parameters by
    # SGD Since this model has many parameters, it would be tedious to
    # manually create an update rule for each model parameter. We thus
    # create the updates list by automatically looping over all
    # (params[i], grads[i]) pairs.
    updates = []
    for param_i, grad_i in zip(params, grads):
        updates.append((param_i, param_i - learning_rate * grad_i))

    train_model = theano.function([index], cost, updates=updates,
          givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]})

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'
    # early-stopping parameters
    patience = 10000  # look as this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience / 2)
                                  # go through this many
                                  # minibatche before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_params = None
    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        print '...epoch is', epoch, 'writefile'
        writefile(getDataJson(layers))
        for minibatch_index in xrange(n_train_batches):

            iter = (epoch - 1) * n_train_batches + minibatch_index

            if iter % 100 == 0:
                print 'training @ iter = ', iter
            cost_ij = train_model(minibatch_index)

            if (iter + 1) % validation_frequency == 0:

                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' % \
                      (epoch, minibatch_index + 1, n_train_batches, \
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:

                    #improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i
                                   in xrange(n_test_batches)]
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of best '
                           'model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i,'\
          ' with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))
    readfile(layers, nkerns)
    validation_losses = [validate_model(i) for i
                         in xrange(n_valid_batches)]
    this_validation_loss = numpy.mean(validation_losses)
    print('validation error %f %%' % \
          (this_validation_loss * 100.))

if __name__ == '__main__':
    evaluate_lenet5()
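The serialization convention above (getDataJson rotates each 2D kernel by 180 degrees before tolist/json.dump; readwb rotates back on load) round-trips cleanly because a 180-degree rotation is its own inverse. A minimal stdlib-only sketch of that round-trip — the helper names `rot180`, `save_weights`, and `load_weights` are my own, not from the project, and the sample values reuse arrKernel/arr2 from MakeCnnWeigh:

```python
import json
import os
import tempfile

def rot180(kernel):
    # reverse the row order, then each row: a 180-degree rotation
    return [row[::-1] for row in kernel[::-1]]

def save_weights(kernels, path):
    # mirror getDataJson: rotate each 2D kernel 180 degrees, then json.dump
    with open(path, 'w') as f:
        json.dump([rot180(k) for k in kernels], f)

def load_weights(path):
    # mirror readwb: rotate each kernel back after json.load
    with open(path) as f:
        return [rot180(k) for k in json.load(f)]

kernels = [[[4, 7, 1], [3, 8, 5], [3, 2, 3]],
           [[6, 5, 4], [5, 4, 3], [4, 3, 2]]]

path = os.path.join(tempfile.gettempdir(), 'theanocnn_demo.json')
save_weights(kernels, path)
restored = load_weights(path)
assert restored == kernels  # rot180 applied twice is the identity
print(restored[0][0])  # [4, 7, 1]
```

The same involution argument applies to the transpose that getDataJson uses for 2D weight matrices, which is why readwb can undo it with a second transpose.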
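As a closing sanity check on the convolution semantics that caused the first bug: theano.tensor.nnet.conv.conv2d computes true convolution, which is the same as cross-correlation with a kernel flipped by 180 degrees; since the saved weights are pre-flipped, the C++ forward pass can run a plain sliding-window correlation. A pure-Python sketch of that equivalence (the function names `correlate2d_valid` and `conv2d_valid` are mine, not from the project):

```python
def rot180(k):
    # reverse row order, then each row: a 180-degree rotation
    return [row[::-1] for row in k[::-1]]

def correlate2d_valid(img, k):
    # plain sliding-window cross-correlation, 'valid' mode
    kh, kw = len(k), len(k[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + u][j + v] * k[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(ow)]
            for i in range(oh)]

def conv2d_valid(img, k):
    # true convolution = correlation with the kernel rotated 180 degrees
    return correlate2d_valid(img, rot180(k))

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
k = [[1, 0],
     [0, 2]]

print(conv2d_valid(img, k))               # [[7, 10], [16, 19]]
print(correlate2d_valid(img, rot180(k)))  # identical
```

For any non-symmetric kernel, skipping the rotation on either side makes the two results disagree, which is exactly how the original mismatch showed up.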