Navigating Multimodal Complexity: Advances in Model Design, Dataset Creation, and Evaluation Techniques