https://eugeneyan.com/rss Sep 30, 19:15
Takeaways and Reflections from Learning Functional Programming in Scala

 

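The author shares experience using Spark on Lazada’s data science team and describes taking the Coursera course “Functional Programming in Scala”, taught by Scala’s designer, Martin Odersky. The course covers Scala’s basic syntax, functional programming concepts, data structures, and recursion, and stresses unit testing as a software engineering practice. The author found recursive thinking to be the main challenge of the course, and counts unit testing and problem decomposition as important additional takeaways that should improve both the efficiency of data science work and the robustness of the code. The article closes by recommending Scala School and Scala Exercises as resources for further study.

🎓 Course overview and goals: The author often works with big data on Lazada’s data science team and therefore chose to learn Scala. The Coursera course “Functional Programming in Scala”, taught by Scala’s designer Martin Odersky himself, offers a structured way to learn Scala’s fundamentals and syntax, helping the author move beyond self-teaching through Spark and Stack Overflow alone.

💡 Core content and challenges: The six-week course covers functional programming design principles (such as call-by-value vs. call-by-name), Scala programming concepts (higher-order functions, classes, objects, traits, types), class hierarchies and data structures (lists, vectors, maps), basic syntax, pattern matching, and subtyping and variance. The author found that the biggest challenge was not Scala’s syntax or working with a compiled language, but the logic of thinking through and solving problems with recursion, in particular understanding and applying tail recursion in depth.

🛠️ Key practices and additional takeaways: Beyond the Scala language and the functional programming paradigm, the author gained two valuable software engineering practices: 1. Unit testing (using ScalaTest): from week one, writing unit tests was taught and encouraged, with tests used to formalize and check edge cases so that the code becomes more robust. 2. Problem decomposition and tail recursion: through the course’s exercises, the author learned to break complex problems down and solve them with tail recursion, which fixed stack overflow errors encountered in Spark and improved efficiency. The author also notes that, at work, recursive solutions probably won’t be the first choice unless they are necessary.

🚀 Next steps and recommended resources: The author considers the course very helpful for understanding Scala’s design philosophy and for using Scala more effectively. At the end of the course, Martin Odersky recommended Scala School (covering the basics, collections, SBT, and more) and Scala Exercises (learning Scala features through interactive exercises in the browser) as resources for further study. The author considers Scala Exercises the more practical of the two for improving Spark-related software engineering skills.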

At Lazada’s Data Science team, I use Spark a fair bit, especially when the data gets big (e.g., online behavioural and transaction data). While PySpark, the Python API for Spark, was available when I started, I decided early on to code in Scala. Perhaps I relished the challenge or just wanted to pick up a new language.

Why take the Functional Programming in Scala course?

Before the course, my programming skills in Scala were mainly self-taught, through the school of hard knocks and Stack Overflow. Thus, when the course was made available on Coursera, I saw the opportunity to learn about the fundamentals of Scala (away from Spark) and its syntax in a structured fashion.

What is the course like?

The course is taught by Martin Odersky, designer of the Scala programming language. It follows a structure commonly found in MOOCs—approximately two hours of lectures (more theoretical) and a lab assignment taking three hours (more practical) weekly.

Over the course of six weeks, Martin taught about:

- Functional programming design (call-by-value vs. call-by-name, def vs. val, etc.)
- Scala programming concepts (higher-order functions, classes, objects, traits, types, etc.)
- Scala’s class hierarchy and data structures (lists, vectors, maps, etc.)
- Basic Scala syntax
- Pattern matching (no, not regex)
- Subtyping and variance (covariant, contravariant, nonvariant)
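
To make the first topic concrete, here is a minimal sketch of call-by-value vs. call-by-name. The object and method names (`EvalStrategies`, `twiceByValue`, `twiceByName`, `firstByName`, `countEvaluations`) are made up for illustration and are not taken from the course material.

```scala
object EvalStrategies {
  // Call-by-value: the argument is evaluated once, before the body runs.
  def twiceByValue(x: Int): Int = x + x

  // Call-by-name (`=> Int`): the argument expression is re-evaluated
  // every time `x` is referenced inside the body.
  def twiceByName(x: => Int): Int = x + x

  // A by-name argument that is never referenced is never evaluated.
  def firstByName(a: Int, b: => Int): Int = a

  // Counts how many times the argument expression runs under each strategy.
  def countEvaluations(): (Int, Int) = {
    var calls = 0
    def next(): Int = { calls += 1; 1 }

    val v = twiceByValue(next()) // next() runs once
    val byValueCalls = calls

    calls = 0
    val n = twiceByName(next()) // next() runs twice, once per use of `x`
    (byValueCalls, calls)
  }
}
```

With these definitions, `EvalStrategies.countEvaluations()` returns `(1, 2)`, and `EvalStrategies.firstByName(1, sys.error("boom"))` returns `1` without throwing, because the second argument is never evaluated.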

I found the main challenge not to be Scala’s syntax, or working with a compiled language. Rather, the main challenge was thinking through the logic of solving problems through recursion. While I’ve written recursive algorithms before, I hadn’t quite grokked recursion yet.

In the course, almost all assignments were solved through tail recursion. At work, I mostly think about data in the form of tables, strings, or graphs, so solving problems recursively doesn’t come up much.
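
For readers unfamiliar with the term, a function is tail recursive when the recursive call is the very last thing it does, so the Scala compiler can turn it into a loop. The sketch below is an illustrative example (not one of the course assignments); the names `Factorial`, `naive`, and `tailRecursive` are chosen here for illustration.

```scala
import scala.annotation.tailrec

object Factorial {
  // Not tail recursive: the multiplication happens after the recursive
  // call returns, so every level of recursion keeps a stack frame alive.
  def naive(n: Int): BigInt =
    if (n <= 1) BigInt(1) else BigInt(n) * naive(n - 1)

  // Tail recursive: the running product is carried in `acc`, and the
  // recursive call is the last action. @tailrec makes the compiler
  // reject the method if it cannot be compiled into a loop.
  def tailRecursive(n: Int): BigInt = {
    @tailrec
    def loop(k: Int, acc: BigInt): BigInt =
      if (k <= 1) acc else loop(k - 1, acc * BigInt(k))
    loop(n, BigInt(1))
  }
}
```

The usual trick, as here, is to move the pending work (the multiplication) into an accumulator parameter so that nothing remains to be done after the recursive call.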

What did I learn from the course?

While the course focused on the Scala language and functional programming paradigm, I gained two other lessons that I value just as much.

There was a lot of emphasis on a key software engineering practice: testing (using ScalaTest). Beginning in week one, the practice of writing unit tests was taught and encouraged. Throughout the course, Martin pointed out edge cases in the code and showed how they can be formalized and checked consistently in unit tests.

The lab assignments progressively taught more sophisticated ScalaTest methods, and how to test more effectively. Improving on the practice of testing will make my code more robust, my work more efficient, and me a better data scientist in the long run.
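
As a rough sketch of what formalizing edge cases in a test can look like, the suite below checks a small helper against an empty input, a single element, and a typical input. It assumes ScalaTest 3.1 or later is on the classpath (for `org.scalatest.funsuite.AnyFunSuite`); `Stats.mean` and `StatsSuite` are hypothetical names invented for this example, not code from the course.

```scala
import org.scalatest.funsuite.AnyFunSuite

object Stats {
  // Hypothetical helper under test: returns None for an empty input
  // instead of dividing by zero.
  def mean(xs: Seq[Double]): Option[Double] =
    if (xs.isEmpty) None else Some(xs.sum / xs.size)
}

class StatsSuite extends AnyFunSuite {
  // Edge case: empty input.
  test("mean of an empty sequence is None") {
    assert(Stats.mean(Seq.empty).isEmpty)
  }

  // Edge case: a single element.
  test("mean of a single element is that element") {
    assert(Stats.mean(Seq(4.0)).contains(4.0))
  }

  // Typical case.
  test("mean of several elements") {
    assert(Stats.mean(Seq(1.0, 2.0, 3.0)).contains(2.0))
  }
}
```

Each edge case gets its own named test, so a failure points directly at the case that broke.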

I also gained practice in breaking problems down and solving them through tail recursion. I’ve come across user-defined functions in Spark that lead to a stack overflow error when executed. Putting in additional thought and writing them in a tail-recursive fashion fixed this issue and also led to efficiency improvements. Nonetheless, I probably won’t be actively thinking about recursive solutions at work unless absolutely necessary.
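
The Spark functions in question aren’t shown here, so the following is only an illustrative sketch of the general pattern, using plain Scala lists and hypothetical names (`SafeSum`, `sumNaive`, `sumTailRec`). The naive version needs one stack frame per element and will typically throw a `StackOverflowError` on a long enough list with default JVM stack settings, while the tail-recursive version runs in constant stack space.

```scala
import scala.annotation.tailrec

object SafeSum {
  // Naive recursion: the addition is pending until the recursive call
  // returns, so stack depth grows with the length of the list.
  def sumNaive(xs: List[Long]): Long = xs match {
    case Nil          => 0L
    case head :: tail => head + sumNaive(tail)
  }

  // Tail-recursive rewrite: the partial sum travels in `acc`, so the
  // compiler turns the recursion into a loop with constant stack depth.
  def sumTailRec(xs: List[Long]): Long = {
    @tailrec
    def loop(rest: List[Long], acc: Long): Long = rest match {
      case Nil          => acc
      case head :: tail => loop(tail, acc + head)
    }
    loop(xs, 0L)
  }
}
```

For example, `SafeSum.sumTailRec(List.fill(1000000)(1L))` returns `1000000`, whereas calling `sumNaive` on the same list is the kind of call that is likely to overflow the stack.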

What’s next?

The course was excellent for learning about the thinking that went into the design of Scala as a functional language, and how to use Scala more effectively.

At the concluding lecture, Martin recommended additional learning resources. Two are worth highlighting here. First, there’s Scala School by Twitter, which covers the basics, collections, Simple Build Tool (SBT), and more. Second, there’s Scala Exercises by 47 Degrees, which covers more features of Scala through simple interactive exercises in the browser. I find Scala Exercises more practical and more likely to improve my software engineering skills in Spark.

Conclusion

I highly recommend this short six-week course if you would like to learn the basics of Scala from the designer of Scala himself. Martin is a fantastic teacher and taught effectively through online videos and lab assignments. The forums were also very helpful. There, you’ll find people who are stuck on the same problem as you are, and teaching assistants providing helpful hints.

Questions? Want to follow my journey? Reach out on Twitter @eugeneyan!

If you found this useful, please cite this write-up as:

Yan, Ziyou. (Jul 2016). Thoughts on Functional Programming in Scala Course (Coursera). eugeneyan.com. https://eugeneyan.com/writing/thoughts-on-functional-programming-in-scala-course-coursera/.

or

@article{yan2016scala,
  title   = {Thoughts on Functional Programming in Scala Course (Coursera)},
  author  = {Yan, Ziyou},
  journal = {eugeneyan.com},
  year    = {2016},
  month   = {Jul},
  url     = {https://eugeneyan.com/writing/thoughts-on-functional-programming-in-scala-course-coursera/}
}