We consider a general incompressible finite model protein of size M in its environment, which we represent by a semiflexible copolymer consisting of amino acid residues classified into only two species (H and P, see text) following Lau and Dill. We allow various interactions between chemically unbonded residues in a given sequence and the solvent (water), and exactly enumerate the number of conformations W(E) as a function of the energy E on an infinite lattice under two different conditions: (i) we allow conformations that are restricted to be compact (known as Hamilton walk conformations), and (ii) we allow unrestricted conformations that can also be non-compact. It is easily demonstrated using plausible arguments that our model does not possess any energy gap even though it is supposed to exhibit a sharp folding transition in the thermodynamic limit. The enumeration allows us to investigate exactly the effects of energetics on the native state(s), and the effect of small size on protein thermodynamics and, in particular, on the differences between the microcanonical and canonical ensembles. We find that the canonical entropy is much larger than the microcanonical entropy for finite systems. We investigate the property of self-averaging and conclude that small proteins do not self-average. We also present results that (i) provide some understanding of the energy landscape, and (ii) shed light on the free energy landscape at different temperatures.
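The exact enumeration of W(E) described above can be illustrated with a much simpler model than the one studied in the paper. The sketch below assumes a plain two-dimensional square-lattice HP chain in which the only interaction is an energy of -1 for each pair of non-bonded, nearest-neighbour H residues; the semiflexibility, solvent interactions and compactness restriction of the actual model are omitted. The names `density_of_states` and `canonical_entropy` are hypothetical, and k_B is set to 1.

```python
import math
from collections import Counter

# Nearest-neighbour steps on the square lattice.
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def density_of_states(sequence):
    """Return {E: W(E)} for all self-avoiding conformations of an HP chain.

    Assumption: E = -1 per non-bonded nearest-neighbour H-H contact.
    The first residue is fixed at the origin; conformations related by
    rotation or reflection are counted separately.
    """
    n = len(sequence)
    counts = Counter()

    def grow(path, occupied, energy):
        i = len(path)  # index of the residue to be placed next
        if i == n:
            counts[energy] += 1
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site in occupied:
                continue  # self-avoidance
            gain = 0
            if sequence[i] == "H":
                for ex, ey in MOVES:
                    j = occupied.get((site[0] + ex, site[1] + ey))
                    # j < i - 1 excludes the chemically bonded neighbour
                    if j is not None and j < i - 1 and sequence[j] == "H":
                        gain -= 1
            occupied[site] = i
            path.append(site)
            grow(path, occupied, energy + gain)
            path.pop()
            del occupied[site]

    grow([(0, 0)], {(0, 0): 0}, 0)
    return dict(counts)


def canonical_entropy(w, temperature):
    """Canonical entropy S(T) = ln Z + <E>/T computed from W(E)."""
    e0 = min(w)  # shift energies by the ground state for numerical stability
    weights = {e: g * math.exp(-(e - e0) / temperature) for e, g in w.items()}
    z = sum(weights.values())
    mean_shifted = sum((e - e0) * p for e, p in weights.items()) / z
    return math.log(z) + mean_shifted / temperature


w = density_of_states("HHHH")
microcanonical = {e: math.log(g) for e, g in w.items()}  # S(E) = ln W(E)
```

For the four-residue chain `"HHHH"` this enumeration finds 28 conformations with E = 0 and 8 with E = -1. Comparing `canonical_entropy(w, T)` with `microcanonical[E]` at the energy closest to the canonical mean is one way to examine, in this toy setting, the kind of ensemble difference the abstract reports for finite systems.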
Hidden Markov models (HMMs) have long been a popular choice for Western cursive handwriting recognition, following their success in speech recognition. Even for the recognition of Oriental scripts such as Chinese, Japanese and Korean, hidden Markov models are increasingly being used to model substrokes of characters. However, when it comes to Indic script recognition, the published work employing HMMs is limited and generally focussed on isolated character recognition. In this work, a data-driven HMM-based online handwritten word recognition system for Tamil, an Indic script, is proposed. The accuracies obtained ranged from 98% to 92.2% as the lexicon size increased from 1K to 20K words. These initial results are promising and warrant further research in this direction. The results also encourage exploring the adoption of the approach for other Indic scripts.
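The abstract does not specify the features, model topology or training procedure of the proposed system, so the following is only a generic sketch of lexicon-driven HMM recognition, not the authors' method. It assumes discrete observation symbols and one small left-to-right HMM per lexicon entry, scores an observation sequence with the forward algorithm in log space, and returns the best-scoring word. The two-word lexicon, its probabilities, and all function names are hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps zero probability to -inf."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_forward(obs, log_start, log_trans, log_emit):
    """Log-likelihood of a discrete observation sequence under one HMM."""
    states = range(len(log_start))
    alpha = [log_start[s] + log_emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + log_trans[r][s] for r in states])
            + log_emit[s][o]
            for s in states
        ]
    return log_sum_exp(alpha)


def make_model(start, trans, emit):
    """Convert probability tables to log space."""
    return (
        [safe_log(p) for p in start],
        [[safe_log(p) for p in row] for row in trans],
        [[safe_log(p) for p in row] for row in emit],
    )


def recognize(obs, lexicon):
    """Return the lexicon entry whose HMM gives obs the highest likelihood."""
    return max(lexicon, key=lambda word: log_forward(obs, *lexicon[word]))


# Hypothetical two-word lexicon: two-state left-to-right models over symbols {0, 1}.
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
lexicon = {
    "a": make_model([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),
    "b": make_model([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}

best = recognize([0, 0, 1, 1], lexicon)
```

Because every lexicon entry is scored independently in this sketch, the cost grows linearly with lexicon size, which is one reason larger lexicons tend to be harder for this kind of recognizer.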