Hand-Writing a Neural Network (Without PyTorch Modules) for the MNIST Handwritten Digit Dataset

Tags: PyTorch  Machine Learning  Deep Learning  Artificial Intelligence  Neural Networks

Overview

This was a small assignment our teacher set in class; I'm sharing it here for the record. It was originally implemented with Paddle, and since Paddle's API closely mirrors PyTorch's, anyone who wants to run it with torch only needs to swap the Paddle-specific parts for their torch equivalents. The catch is that although things like the loss functions (BCELoss, Softmax and so on) are built into torch, our teacher asked us to write them out ourselves, and the same goes for the backpropagation.

Approach

Softmax

For Softmax, the main thing to get right is its backward pass.

    def value(self, x: np.ndarray) -> np.ndarray:
        n, k = x.shape
        beta = x.max(axis = 1).reshape((n, 1))
        tmp = np.exp(x - beta)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        val = tmp / numer
        return val
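A quick sanity check of value() (a minimal sketch with made-up numbers, assuming the method lives in the Softmax class from the full code below): every row of the output is a probability distribution.

    import numpy as np

    sm = Softmax()                              # the class defined in the full code below
    x = np.array([[1.0, 2.0, 3.0],
                  [10.0, 10.0, 10.0]])          # two samples, three classes (made-up numbers)
    p = sm.value(x)
    print(p)                                    # largest logit gets the largest probability
    print(p.sum(axis=1))                        # -> [1. 1.], each row sums to one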

Softmax's Derivative

    def derivative(self, x: np.ndarray) -> np.ndarray:
        n, k = x.shape
        D = np.zeros((k, k, n))
        for i in range(n):
            tmp = x[i:i+1, :]                       # the i-th sample, shape (1, k)
            val = self.value(tmp)                   # softmax of this sample only
            D[:,:,i] = np.diag(val.reshape(-1)) - val.T.dot(val)
        return D
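Per sample, the Jacobian of softmax is ds_i/dx_j = s_i * (delta_ij - s_j), i.e. diag(s) - s^T s, which is exactly what each D[:, :, i] slice stores. A finite-difference check of the derivative() above (a sketch with random, made-up inputs):

    import numpy as np

    sm = Softmax()
    x = np.random.randn(2, 4)                   # 2 samples, 4 classes
    D = sm.derivative(x)                        # shape (4, 4, 2)

    eps, i = 1e-6, 0                            # numerically check the first sample
    num = np.zeros((4, 4))
    for j in range(4):
        xp, xm = x.copy(), x.copy()
        xp[i, j] += eps
        xm[i, j] -= eps
        num[:, j] = (sm.value(xp)[i] - sm.value(xm)[i]) / (2 * eps)
    print(np.max(np.abs(num - D[:, :, i])))     # should be tiny, around 1e-9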

CrossEntropyLoss

    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        yhat = np.clip(yhat, 0.0001, 0.9999) # clip predictions, as in logistic regression, to avoid log(0)
        los = -np.mean(np.multiply(np.log(yhat), y) + np.multiply(np.log(1 - yhat), (1 - y)))
        return los

CrossEntropyLoss's Derivative

    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        der = (yhat - y) / (yhat * (1 - yhat))
        return der
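A tiny usage sketch with made-up one-hot labels and predictions; note that value() averages over all n*k entries, while derivative() returns the element-wise gradient of the summed (un-averaged) loss.

    import numpy as np

    ce = CrossEntropy()                         # the class sketched above
    y = np.array([[1., 0., 0.],
                  [0., 1., 0.]])                # one-hot labels, n=2, k=3
    yhat = np.array([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]])          # made-up softmax-style predictions
    print(ce.value(yhat, y))                    # average element-wise cross entropy
    print(ce.derivative(yhat, y).shape)         # (2, 3), same shape as yhat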

CEwithLogitLoss

Unlike the plain cross-entropy above, this loss takes the raw logits (the input of the softmax) rather than the softmax output yhat, which makes it numerically more stable.

    def value(self, logits: np.ndarray, y: np.ndarray) -> float:
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        tmp = np.sum(tmp, axis = 1)
        tmp = np.log(tmp+1.0e-40)
        los = -np.sum(y*logits) + np.sum(beta) + np.sum(tmp)
        los = los / n
        return los
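This is just the numerically stable log-sum-exp form of -(1/n) * sum(y * log softmax(logits)); subtracting the row maximum beta keeps the exponentials from overflowing. A quick check that the two forms agree (a sketch, assuming the Softmax and CEwithLogit classes from this post):

    import numpy as np

    loss = CEwithLogit()
    logits = np.random.randn(5, 10)                     # 5 samples, 10 classes
    y = np.eye(10)[np.random.randint(0, 10, size=5)]    # random one-hot labels

    p = Softmax().value(logits)
    naive = -np.mean(np.sum(y * np.log(p), axis=1))     # plain softmax cross entropy
    print(naive, loss.value(logits, y))                 # the two numbers should match closely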

CEwithLogitLoss's Derivative

    def derivative(self, logits: np.ndarray, y: np.ndarray) -> np.ndarray:
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        yhat = tmp/numer
        der = (yhat - y) / n
        return der
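So the gradient is simply (softmax(logits) - y) / n, which matches a finite-difference check of value() (again a sketch with random inputs):

    import numpy as np

    loss = CEwithLogit()
    logits = np.random.randn(3, 4)
    y = np.eye(4)[[0, 2, 1]]                    # one-hot labels for 3 samples

    der = loss.derivative(logits, y)
    eps = 1e-6
    num = np.zeros_like(logits)
    for i in range(3):
        for j in range(4):
            lp, lm = logits.copy(), logits.copy()
            lp[i, j] += eps
            lm[i, j] -= eps
            num[i, j] = (loss.value(lp, y) - loss.value(lm, y)) / (2 * eps)
    print(np.max(np.abs(num - der)))            # should be tiny, on the order of 1e-8 or less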

Once these pieces are clear, the rest is mostly bookkeeping.

Full Code

import numpy as np
import struct
import matplotlib.pyplot as plt
import os
from PIL import Image
from sklearn.utils import gen_batches
from sklearn.metrics import classification_report, confusion_matrix
from typing import *
from numpy.linalg import *


train_image_file = './data/data136845/train-images-idx3-ubyte'
train_label_file = './data/data136845/train-labels-idx1-ubyte'
test_image_file = './data/data136845/t10k-images-idx3-ubyte'
test_label_file = './data/data136845/t10k-labels-idx1-ubyte'


def decode_image(path):
    with open(path, 'rb') as f:
        magic, num, rows, cols = struct.unpack('>IIII', f.read(16))
        images = np.fromfile(f, dtype=np.uint8).reshape(-1, 784)
        images = np.array(images, dtype = float)
    return images

def decode_label(path):
    with open(path, 'rb') as f:
        magic, n = struct.unpack('>II',f.read(8))
        labels = np.fromfile(f, dtype=np.uint8)
        labels = np.array(labels, dtype = float)
    return labels

def load_data():
    train_X = decode_image(train_image_file)
    train_Y = decode_label(train_label_file)
    test_X = decode_image(test_image_file)
    test_Y = decode_label(test_label_file)
    return (train_X, train_Y, test_X, test_Y)
trainX, trainY, testX, testY = load_data()

num_train, num_feature = trainX.shape
plt.figure(1, figsize=(20,10))
for i in range(8):
    idx = np.random.choice(range(num_train))
    plt.subplot(int('24'+str(i+1)))
    plt.imshow(trainX[idx,:].reshape((28,28)))
    plt.title('label is %d'%trainY[idx])
plt.show()

# normalize input value between 0 and 1.
trainX, testX = trainX/255, testX/255

# convert all scaler labels to one-hot vectors.
def to_onehot(y):
    y = y.astype(int)
    num_class = len(set(y))
    Y = np.eye((num_class))
    return Y[y]

trainY = to_onehot(trainY)
testY = to_onehot(testY)
num_train, num_feature = trainX.shape
num_test, _ = testX.shape
_, num_class = trainY.shape
print('number of features is %d'%num_feature)
print('number of classes is %d'%num_class)
print('number of training samples is %d'%num_train)
print('number of testing samples is %d'%num_test)
from abc import ABC, abstractmethod, abstractproperty

class Activation(ABC):
    '''
    An abstract class that implements an activation function
    '''
    @abstractmethod
    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Value of the activation function when input is x.
        Parameters:
          x is an input to the activation function.
        Returns: 
          Value of the activation function. The shape of the return is the same as that of x.
        '''
        return x
    @abstractmethod
    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Derivative of the activation function with input x.
        Parameters:
          x is the input to activation function
        Returns: 
          Derivative of the activation function w.r.t x.
        '''
        return x

class Identity(Activation):
    '''
    Identity activation function. Input and output are identical. 
    '''

    def __init__(self):
        super(Identity, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        return x
    
    def derivative(self, x: np.ndarray) -> np.ndarray:
        n, m = x.shape
        return np.ones((n, m))
    

class Sigmoid(Activation):
    '''
    Sigmoid activation function y = 1/(1 + e^(x*k)), where k is the parameter of the sigmoid function 
    '''

    def __init__(self, k: float = 1.):
        '''
        Parameters:
          k is the parameter of the sigmoid function.
        '''
        self.k = k
        super(Sigmoid, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is a two dimensional numpy array.
        Returns:
          element-wise sigmoid value of the two dimensional array.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        val = 1 / (1 + np.exp(np.negative(x * self.k)))
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is a two dimensional array.
        Returns:
          a two dimensional array whose shape is the same as that of x. The returned value is the elementwise 
          derivative of the sigmoid function w.r.t. x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        val = 1 / (1 + np.exp(np.negative(x * self.k)))
        der = val * (1 - val)
        return der
    
class ReLU(Activation):
    '''
    Rectified linear unit activation function
    '''

    def __init__(self):
        super(ReLU, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE BELOW ####
        '''
        val = x*(x>=0)
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        The derivative of the ReLU function w.r.t. x. Set the derivative to 0 at x=0.
        Parameters:
          x is the input to ReLU function
        Returns:
          elementwise derivative of ReLU. The shape of the returned value is the same as that of x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        der = np.ones(x.shape)*(x>=0)
        return der


class Softmax(Activation):
    '''
    softmax nonlinear function.
    '''

    def __init__(self):
        '''
        There are no parameters in softmax function.
        '''
        super(Softmax, self).__init__()

    def value(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is the input to the softmax function. x is a two dimensional numpy array. Each row is the input to the softmax function
        Returns:
          output of the softmax function. The returned value is with the same shape as that of x.
        '''
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = x.shape
        beta = x.max(axis = 1).reshape((n, 1))
        tmp = np.exp(x - beta)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        val = tmp / numer
        return val

    def derivative(self, x: np.ndarray) -> np.ndarray:
        '''
        Parameters:
          x is the input to the softmax function. x is a two dimensional numpy array.
        Returns:
          a three dimensional array D with shape (k, k, n); D[:, :, i] is the Jacobian of the softmax w.r.t. the i-th row of x.
        '''
        n, k = x.shape
        D = np.zeros((k, k, n))
        for i in range(n):
            tmp = x[i:i+1, :]                       # the i-th sample, shape (1, k)
            val = self.value(tmp)                   # softmax of this sample only
            D[:,:,i] = np.diag(val.reshape(-1)) - val.T.dot(val)
        return D

##################################################################################################################
# LOSS FUNCTIONS
##################################################################################################################

class Loss(ABC):
    '''
    Abstract class for a loss function
    '''
    @abstractmethod
    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        '''
        Value of the empirical loss function.
        Parameters:
          y_hat is the output of a neural network. The shape of y_hat is (n, k).
          y contains true labels with shape (n, k).
        Returns:
          value of the empirical loss function.
        '''
        return 0

    @abstractmethod
    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        Derivative of the empirical loss function with respect to the predictions.
        Parameters:
          
        Returns:
          The derivative of the loss function w.r.t. y_hat. The returned value is a two dimensional array with 
          shape (n, k)
        '''
        return yhat

class CrossEntropy(Loss):
    '''
    Cross entropy loss function
    '''

    def value(self, yhat: np.ndarray, y: np.ndarray) -> float:
        '''
        #### YOUR CODE BELOW ####
        '''
        yhat = np.clip(yhat, 0.0001, 0.9999) # clip predictions, as in logistic regression, to avoid log(0)
        los = -np.mean(np.multiply(np.log(yhat), y) + np.multiply(np.log(1 - yhat), (1 - y)))
        return los

    def derivative(self, yhat: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE HERE ####
        '''

        der = (yhat - y) / (yhat * (1 - yhat))
        return der

class CEwithLogit(Loss):
    '''
    Cross entropy loss function with logits (input of softmax activation function) and true labels as inputs.
    '''
    def value(self, logits: np.ndarray, y: np.ndarray) -> float:
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        tmp = np.sum(tmp, axis = 1)
        tmp = np.log(tmp+1.0e-40)
        los = -np.sum(y*logits) + np.sum(beta) + np.sum(tmp)
        los = los / n
        return los


    def derivative(self, logits: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        #### YOUR CODE BELOW ####
        '''
        n, k = y.shape
        beta = logits.max(axis = 1).reshape((n, 1))
        tmp = logits - beta
        tmp = np.exp(tmp)
        numer = np.sum(tmp, axis = 1, keepdims = True)
        yhat = tmp/numer
        der = (yhat - y) / n
        return der
##################################################################################################################
# METRICS
##################################################################################################################

def accuracy(y_hat: np.ndarray, y: np.ndarray) -> float:
    '''
    Accuracy of predictions, given the true labels.
    Parameters:
      y_hat is a two dimensional array. Each row is a softmax output.
      y is a two dimensional array. Each row is a one-hot vector.
    Returns:
      accuracy which is a float number
    '''
    '''
    #### YOUR CODE HERE ####
    '''
    n = y.shape[0]
    acc = np.sum(np.argmax(y_hat, axis = 1) == np.argmax(y, axis = 1)) / n
    return acc

# the following code implements a three layer neural network, namely input layer, hidden layer and output layer
digits = 10 # number of classes
_, n_x = trainX.shape
n_h = 64 # number of nodes in the hidden layer
learning_rate = 0.0001

sigmoid = ReLU() # hidden-layer activation (the variable keeps the name "sigmoid", but a ReLU is used here)
softmax = Softmax() # nonlinear function in the output layer
loss = CEwithLogit() # loss function
epoches = 2000 

# initialization of W1, b1, W2, b2
W1 = np.random.randn(n_x, n_h)
b1 = np.random.randn(1, n_h)
W2 = np.random.randn(n_h, digits)
b2 = np.random.randn(1, digits)

# training procedure: full-batch gradient descent on the whole training set
for epoch in range(epoches):
    # forward pass
    Z1 = np.dot(trainX, W1) + b1
    A1 = sigmoid.value(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = softmax.value(Z2)
    cost = loss.value(Z2, trainY)

    # backward pass: for the CE-with-logits loss, dL/dZ2 = (softmax(Z2) - Y) / n
    dZ2 = loss.derivative(Z2, trainY)
    dW2 = np.dot(A1.T, dZ2)
    db2 = np.sum(dZ2, axis = 0, keepdims=True)

    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * sigmoid.derivative(Z1) 
    dW1 = np.dot(trainX.T, dZ1)
    db1 = np.sum(dZ1, axis = 0, keepdims=True)

    # gradient-descent parameter update
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1


    if (epoch % 100 == 0):
        print("Epoch", epoch, "cost: ", cost)

print("Final cost:", cost)

# testing procedure
Z1 = np.dot(testX, W1) + b1
A1 = sigmoid.value(Z1)
Z2 = np.dot(A1, W2) + b2
A2 = softmax.value(Z2)

predictions = np.argmax(A2, axis = 1)
labels = np.argmax(testY, axis = 1)

print(confusion_matrix(predictions, labels))
print(classification_report(predictions, labels))

# design a neural network class

class NeuralNetwork():
    '''
    Fully connected neural network.
    Attributes:
      n_layers is the number of layers.
      activation is a list of Activation objects corresponding to each layer's activation function.
      loss is a Loss object corresponding to the loss function used to train the network.
      learning_rate is the learning rate.
      W is a list of weight matrix used in each layer.
      b is a list of biases used in each layer.
    '''

    def __init__(self, layer_size: List[int], activation: List[Activation], loss: Loss, learning_rate: float = 0.01) -> None:
        '''
        Initializes a NeuralNetwork object
        '''
        assert len(activation) == len(layer_size), \
        "Number of sizes for layers provided does not equal the number of activation"
        self.layer_size = layer_size
        self.num_layer = len(layer_size)
        self.activation = activation
        self.loss = loss
        self.learning_rate = learning_rate
        self.W = []
        self.b = []
        for i in range(self.num_layer-1):
            W = np.random.randn(layer_size[i], layer_size[i+1]) #/ np.sqrt(layer_size[i])
            b = np.random.randn(1, layer_size[i+1])
            self.W.append(W)
            self.b.append(b)
        self.A = []
        self.Z = []

    def forward(self, X: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        '''
        Forward pass of the network on a dataset of n examples with m features. Except for the first layer, each
        layer computes a linear transformation plus a bias, followed by a nonlinear transformation.
        Parameters:
          X is the training data with shape (n, m).
        Returns:
          Z[-1] and A[-1], the net input and the activation of the last layer. The full per-layer lists are also
            stored in self.Z and self.A; each contains self.num_layer arrays and the i-th array has shape
            (n, self.layer_size[i]).
        '''
        num_sample = X.shape[0]
        A, Z = [], []
        for i in range(self.num_layer):
            if i == 0:
                a = X.copy()
                z = X.copy()
            else:
                a = A[-1]
                z = a.dot(self.W[i-1]) + self.b[i-1]
                a = self.activation[i].value(z)
            Z.append(z)
            A.append(a)
        self.A = A
        self.Z = Z
        return Z[-1], A[-1]

    def backward(self, dLdyhat: np.ndarray) -> None:
        '''
        Backward pass of the network on a dataset of n examples with m features. The derivatives are computed from
          the end of the network to the front.
        Parameters:
          dLdyhat is the derivative of the empirical loss w.r.t. yhat, the output of the neural network,
            with shape (n, self.layer_size[-1]).
        Returns:
          None. The derivatives of the empirical loss w.r.t. each layer's net input are stored in self.dLdZ;
            there are self.num_layer arrays in that list and the i-th array has shape (n, self.layer_size[i]).
        '''
        dZ = []
        for i in range(self.num_layer-1, -1, -1):
            if i == self.num_layer - 1:
                dLdz = dLdyhat
            else:
                dLda = np.dot(dLdz, self.W[i].T)
                dLdz = self.activation[i].derivative(self.Z[i])*dLda# derivative w.r.t. net input for layer i
            dZ.append(dLdz)
        dZ = list(reversed(dZ))
        self.dLdZ = dZ
        return

    def update_weights(self) -> None:
        '''
        Having computed self.dLdZ in the backward pass, update each weight matrix and bias with the gradient of
        the loss summed over the training examples, scaled by the learning rate.
        '''
        for i in range(self.num_layer-1):
            a = self.A[i]
            dW = np.dot(a.T, self.dLdZ[i+1])
            db = np.sum(self.dLdZ[i+1], axis = 0, keepdims = True)
            self.W[i] -= self.learning_rate*dW
            self.b[i] -= self.learning_rate*db
        return
    
    def one_epoch(self, X: np.ndarray, Y: np.ndarray, batch_size: int, train: bool = True) -> Tuple[float, float]:
        '''
        One epoch of either the training or the testing procedure.
        Parameters:
          X is the data input, a two dimensional numpy array.
          Y contains the one-hot labels, a two dimensional numpy array.
          batch_size is the number of samples in each batch.
          train is a boolean value indicating training or testing procedure.
        Returns:
          loss_value is the average loss function value.
          acc_value is the prediction accuracy.
        '''
        n = X.shape[0]
        slices = list(gen_batches(n, batch_size))
        num_batch = len(slices)
        idx = list(range(n))
        np.random.shuffle(idx)
        loss_value, acc_value = 0, 0
        for i, sl in enumerate(slices):
            index = idx[sl]
            x, y = X[index,:], Y[index]
            z, yhat = self.forward(x)   # Execute forward pass
            if train:
                dLdz = self.loss.derivative(z, y)         # Calculate derivative of the loss with respect to out
                self.backward(dLdz)     # Execute the backward pass to compute the deltas
                self.update_weights()  # Calculate the gradients and update the weights
            loss_value += self.loss.value(z, y)*x.shape[0]
            acc_value += accuracy(yhat, y)*x.shape[0]
        loss_value = loss_value/n
        acc_value = acc_value/n
        return loss_value, acc_value
def train(model: NeuralNetwork, X: np.ndarray, Y: np.ndarray, batch_size: int, epoches: int) -> Tuple[List[float], List[float]]:
    '''
    Trains the neural network.
    Parameters:
      model is a NeuralNetwork object.
      X is the data input, a two dimensional numpy array.
      Y contains the one-hot labels, a two dimensional numpy array.
      batch_size is the number of samples in each batch.
      epoches is an integer, representing the number of epoches.
    Returns:
      epoch_loss is a list of float numbers, representing the loss function value in each epoch.
      epoch_acc is a list of float numbers, representing the accuracy in each epoch.
    '''
    loss_value, acc = model.one_epoch(X, Y, batch_size, train = False)
    epoch_loss, epoch_acc = [loss_value], [acc]
    print('Initialization: ', 'loss %.4f  '%loss_value, 'accuracy %.2f'%acc)
    for epoch in range(epoches):
        if epoch%100 == 0 and epoch > 0: # decrease the learning rate
            model.learning_rate = min(model.learning_rate/10, 1.0e-5)
        loss_value, acc = model.one_epoch(X, Y, batch_size, train = True)
        if epoch%10 == 0:
            print("Epoch {}/{}: Loss={}, Accuracy={}".format(epoch, epoches, loss_value, acc))
        epoch_loss.append(loss_value)
        epoch_acc.append(acc)
    return epoch_loss, epoch_acc

# training procedure
num_sample, num_feature = trainX.shape
epoches = 200
batch_size = 512
Loss = []
Acc = []
learning_rate = batch_size / num_sample
np.random.seed(2022)
model = NeuralNetwork([784, 256, 64, 10], [Identity(), ReLU(), ReLU(), Softmax()], CEwithLogit(), learning_rate = learning_rate)
epoch_loss, epoch_acc = train(model, trainX, trainY, batch_size, epoches)

# testing procedure
test_loss, test_acc = model.one_epoch(testX, testY, batch_size, train = False)
z, yhat = model.forward(testX)
yhat = np.argmax(yhat, axis = 1)
y = np.argmax(testY, axis = 1)
print(yhat.shape, y.shape)
print(confusion_matrix(yhat, y))
print(classification_report(yhat, y))

The full code has two parts: the first trains a three-layer network written out line by line, and the second wraps the same building blocks in a reusable NeuralNetwork class.
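With the NeuralNetwork class version, predicting a single image after training is just one forward() call; a minimal sketch (index 123 is an arbitrary, made-up choice), reusing testX, testY and model from the code above:

    import numpy as np

    idx = 123                                   # any test index
    x = testX[idx:idx+1, :]                     # keep the (1, 784) shape
    _, probs = model.forward(x)                 # activation of the last (Softmax) layer
    print('predicted:', int(np.argmax(probs, axis=1)[0]),
          'true:', int(np.argmax(testY[idx])))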

Results

Initialization:  loss 786.6578   accuracy 0.09
Epoch 0/200: Loss=67.51685728447026, Accuracy=0.6034333333333334
Epoch 10/200: Loss=1.5569211739656414, Accuracy=0.6363833333333333
Epoch 20/200: Loss=1.195973422899852, Accuracy=0.6734166666666667
Epoch 30/200: Loss=1.0514650120496702, Accuracy=0.7037666666666667
Epoch 40/200: Loss=0.9529855833743788, Accuracy=0.7264666666666667
Epoch 50/200: Loss=0.9016336932334414, Accuracy=0.7411833333333333
Epoch 60/200: Loss=0.8326483342395156, Accuracy=0.7559
Epoch 70/200: Loss=0.78866026237516, Accuracy=0.7687
Epoch 80/200: Loss=0.7709660201411853, Accuracy=0.77445
Epoch 90/200: Loss=0.726955942954085, Accuracy=0.7852333333333333
Epoch 100/200: Loss=0.8591143145912691, Accuracy=0.7367
Epoch 110/200: Loss=0.5961169773508302, Accuracy=0.8235333333333333
Epoch 120/200: Loss=0.5877743327835988, Accuracy=0.8259833333333333
Epoch 130/200: Loss=0.5841449699894233, Accuracy=0.8258833333333333
Epoch 140/200: Loss=0.5818215242196157, Accuracy=0.8264
Epoch 150/200: Loss=0.5801588790387723, Accuracy=0.8270833333333333
Epoch 160/200: Loss=0.5788907581218291, Accuracy=0.8273833333333334
Epoch 170/200: Loss=0.5778682345964669, Accuracy=0.8274666666666667
Epoch 180/200: Loss=0.577021751676371, Accuracy=0.82755
Epoch 190/200: Loss=0.5762952337956709, Accuracy=0.8273833333333334
(10000,) (10000,)
[[ 891    0   23   10    0   20   25    7   28   10]
 [   0 1062   11    3    4    2    3    7    5    7]
 [   7   12  848   35   20    9   21   27   27    6]
 [   2   18   16  810    7   70    1    8   31   22]
 [   3    1   19    6  757   14   18   11    8  101]
 [  22    6   13   75   10  616   19    2   53   19]
 [  23    2   20    1   35   46  849    0   19    8]
 [   7    3   21   19    6   10    1  872   14   42]
 [  19   22   45   24    6   62   14    9  756   17]
 [   6    9   16   27  137   43    7   85   33  777]]
              precision    recall  f1-score   support

           0       0.91      0.88      0.89      1014
           1       0.94      0.96      0.95      1104
           2       0.82      0.84      0.83      1012
           3       0.80      0.82      0.81       985
           4       0.77      0.81      0.79       938
           5       0.69      0.74      0.71       835
           6       0.89      0.85      0.87      1003
           7       0.85      0.88      0.86       995
           8       0.78      0.78      0.78       974
           9       0.77      0.68      0.72      1140

    accuracy                           0.82     10000
   macro avg       0.82      0.82      0.82     10000
weighted avg       0.82      0.82      0.82     10000
Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original article: https://blog.csdn.net/weixin_44673253/article/details/125434687
