To address variable selection and architecture design in regression, an extreme learning machine (ELM) based on mutual information is proposed that optimizes the input layer and the hidden layer simultaneously. Mutual information variable selection is combined with ELM: the performance of the network serves as the criterion for variable selection, and the size of the hidden layer is determined incrementally. Simulation results on two multivariate time series data sets and 10 benchmark data sets show the effectiveness of the proposed algorithm, which not only yields a more compact network architecture but also improves generalization performance.
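
As an illustration only, the overall idea can be sketched in Python with NumPy. This is not the authors' implementation, and the following choices are assumptions made for the sketch: a histogram (plug-in) estimate of mutual information, a greedy forward pass over variables ranked by mutual information, sigmoid hidden nodes, validation RMSE as the measure of network performance, and least-squares output weights recomputed each time a hidden node is added. All function names (`mutual_information`, `grow_hidden_layer`, `mi_elm_select`) are hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of I(x; y) in nats (estimator is an assumption)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def hidden_output(X, W, b):
    """Sigmoid hidden-layer output for random input weights W and biases b."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def rmse(pred, target):
    """Root mean squared error between predictions and targets."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def grow_hidden_layer(X_tr, y_tr, X_va, y_va, max_hidden, rng):
    """Add random hidden nodes one at a time; return (best validation RMSE, size)."""
    W = rng.normal(size=(X_tr.shape[1], max_hidden))
    b = rng.normal(size=max_hidden)
    H_tr, H_va = hidden_output(X_tr, W, b), hidden_output(X_va, W, b)
    best_err, best_n = np.inf, 0
    for n in range(1, max_hidden + 1):
        # Output weights by least squares on the first n hidden nodes.
        beta, *_ = np.linalg.lstsq(H_tr[:, :n], y_tr, rcond=None)
        err = rmse(H_va[:, :n] @ beta, y_va)
        if err < best_err:
            best_err, best_n = err, n
    return best_err, best_n


def mi_elm_select(X_tr, y_tr, X_va, y_va, max_hidden=20, seed=0):
    """Greedy forward selection over MI-ranked inputs, scored by ELM performance."""
    mi = [mutual_information(X_tr[:, j], y_tr) for j in range(X_tr.shape[1])]
    order = np.argsort(mi)[::-1]          # most informative variable first
    selected, best_err, best_n = [], np.inf, 0
    for j in order:
        trial = selected + [int(j)]
        rng = np.random.default_rng(seed)  # same random nodes for every trial
        err, n = grow_hidden_layer(
            X_tr[:, trial], y_tr, X_va[:, trial], y_va, max_hidden, rng
        )
        if err < best_err:                # keep the variable only if it helps
            selected, best_err, best_n = trial, err, n
    return selected, best_n, best_err
```

Calling `mi_elm_select(X_train, y_train, X_val, y_val)` returns the indices of the retained input variables, the chosen number of hidden nodes, and the corresponding validation RMSE. Scoring each candidate input set by the best network found for it is what couples the input-layer and hidden-layer choices in this sketch.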