An increase in violence in public spaces has prompted the introduction of more sophisticated technology to improve the safety and security of very crowded environments. Research disciplines such as civil engineering and sociology have studied the crowd phenomenon for years, employing human visual observation to estimate the characteristics of a crowd. Computer vision researchers have increasingly been involved in the study and development of methods for the automatic analysis of the crowd phenomenon. Until recently, most existing methods in computer vision have focussed on extracting a limited number of features in controlled environments, with limited clutter and limited numbers of people. The main goal of this thesis is to advance the state of the art in computer vision methods for use in very crowded and cluttered environments. One of the aims is to devise a method that could, in the near future, help other disciplines such as socio-dynamics and computer animation, where models of crowded scenes are built manually from painstaking visual observation. A series of novel methods is presented here that can learn crowd dynamics automatically by extracting different crowd information from real-world crowded scenes and modelling crowd dynamics using computer vision. The developed methods include an individual behaviour classifier, a scene cluttering level estimator, two people-counting schemes based on colour modelling and tracking, two algorithms for measuring crowd motion by matching local descriptors, and two dynamics modelling methods, one based on statistical techniques and the other on a neural network. The proposed information extraction methods are able to gather both macro information, which represents the properties of the whole crowd, and micro information, which varies from individual (location) to individual (location). The statistically-based dynamics modelling method models the scene implicitly.
Furthermore, a method for discovering the main path of the crowded scene is developed on the basis of this statistical model. The Self-Organizing Map (SOM) is chosen for the neural network approach to modelling dynamics; the resulting SOMs are shown to capture the main dynamics of the crowded scene.