The concept of information entropy is introduced in the limit of "Big Data". Consistent with the notion in physics, it is an Eulerian homogeneous function of degree one in its independent variables, equipped with a variational principle. Through constrained optimization, the Legendre-Fenchel transform leads to a geometric view of the entropy theory. I shall discuss three interesting discoveries: a non-logarithmic entropy function for Markov counting, an information (free energy) manifold $I_F$, and the Minkowski geometry for the polar set of $I_F$. This is joint work with Bing Miao and Yong-Shi Wu.