This is a blog post that I originally wrote for the edX engineering blog.
At the core of edX is the edx-platform, a monolithic Django code-base 2.7 times the size of Django itself.
-------------------------------------------------------------------------------
 Language           Files        Lines         Code     Comments       Blanks
-------------------------------------------------------------------------------
 ActionScript           1          118           74           23           21
 Autoconf              10          425          237          163           25
 CSS                   55        17106        14636         1104         1366
 HTML                 668        72567        36865        30306         5396
 JavaScript          1500       463147       352306        55882        54959
 JSON                  91        14583        14583            0            0
 JSX                   33         2595         2209           62          324
 LESS                   1          949          606          232          111
 Makefile               1           65           49            8            8
 Markdown              23          287          287            0            0
 Mustache               1            1            1            0            0
 Python              3277       559255       442756        29254        87245
 ReStructuredText      48         4252         4252            0            0
 Sass                 424        75559        55569         4555        15435
 Shell                 15          929          505          292          132
 SQL                    4         6283         5081         1186           16
 Plain Text           148         3521         3521            0            0
 TypeScript            20        88506        76800        11381          325
 XML                  364         5283         4757          231          295
 YAML                  36         1630         1361          119          150
-------------------------------------------------------------------------------
 Total               6720      1317061      1016455       134798       165808
-------------------------------------------------------------------------------
35% of the edx-platform is JavaScript. While this code has served edX well since the platform’s inception in 2012, reaching over 11 million learners in thousands of courses on edX.org and many more millions on all of the Open edX instances across the world, it is starting to show its age. Most of it comes in the form of Backbone.js apps loaded by RequireJS in Django Mako templates, with jQuery peppered throughout.
Many valiant efforts are underway to modernize the frontend of edx-platform, including replacing RequireJS with Webpack, Backbone.js with React, and ES5 JavaScript and CoffeeScript with ES6 JavaScript. Many of these efforts were covered in detail at the last Open edX conference and in Open edX Proposal 11: Front End Technology Standards. However, the size and complexity of the edx-platform mean that these kinds of efforts are hard to prioritize, and, in the meantime, frontend developers are forced to wait over 10 minutes for our home-grown asset pipeline to build before they can view changes.
There have also been efforts to incrementally modularize and extract parts of the edx-platform into separate Python packages that could be installed as Django apps, or even as separately deployed microservices. This allows developers to work independently from the rest of the organization inside of a repository that they own and manage, and that is small enough that they could feasibly understand it entirely.
When my team was tasked with improving the user experience of pages in Studio, the tool that course authors use to create course content, we opted to take a similar architectural approach with the frontend and create a new repository where we could develop new pages in isolation and then integrate them back into the edx-platform as a plugin. We named this new independent repository studio-frontend. With this approach, our team owns the entire studio-frontend code-base and can make the best architectural changes required for its features without having to consult with and contend with all of the other teams at edX that contribute to the edx-platform. Developers of studio-frontend can also avoid the platform’s slow asset pipeline by doing all development within the studio-frontend repository and then later integrating the changes into platform.
When edX recently started to conform our platform to the Web Content Accessibility Guidelines 2.0 AA (WCAG 2.0 AA), we faced many challenges in retrofitting our existing frontend code to be accessible. Rebuilding Studio pages from scratch in studio-frontend allows us to not only follow the latest industry standards for building robust and performant frontend applications, but to also build with accessibility in mind from the beginning.
The JavaScript community has made great strides recently to address accessibility issues in modern web apps. However, we had trouble finding an open-source React component library that fully conformed to WCAG 2.0 AA and met all of edX’s needs, so we decided to build our own: Paragon.
Paragon is a library of building-block components like buttons, inputs, icons, and tables which were built from scratch in React to be accessible. The components are styled using the Open edX theme of Bootstrap v4 (edX’s decision to adopt Bootstrap is covered in OEP-16). Users of Paragon may also choose to use the themeable unstyled target and provide their own Bootstrap theme.
Studio-frontend composes Paragon components into higher-level components like an accessibility form or a table for course assets with searching, filtering, sorting, pagination, and upload. While we developed these components in studio-frontend, we were able to improve the base Paragon components. Other teams at edX using the same components were able to receive the same improvements with a single package update.
We were able to follow the typical best practices for developing a React/Redux application inside studio-frontend, but at the end of the day, we still had to somehow get our components inside of existing Studio pages and this is where most of the challenges arose.
The aforementioned move from RequireJS to Webpack in the edx-platform made it possible for us to build our studio-frontend components from source with Webpack within edx-platform. However, this approach tied us to the edx-platform’s slow asset pipeline. If we wanted rapid development, we had to duplicate the necessary Webpack config between both studio-frontend and edx-platform.
Instead, studio-frontend handles building the development and production Webpack builds itself. In development mode, the incremental rebuild that happens automatically when a file is changed takes under a second. The production JavaScript and CSS bundles, which take about 25 seconds to build, are published with every new release to NPM. The edx-platform npm installs studio-frontend and then copies the built production files from node_modules into its Django static files directory, where the rest of the asset pipeline will pick them up.
To actually use the built JavaScript and CSS, edx-platform still needs to include it in its Mako templates. We made a Mako template tag that takes a Webpack entry point name in studio-frontend and generates script tags that include the necessary files from the studio-frontend package. It also dumps all of the initial context that studio-frontend needs from the edx-platform Django app into a JSON object in a script tag on the page that studio-frontend components can access via a shared id. This is how studio-frontend components get initial data from Studio, like which course it’s embedded in.
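As a rough illustration of what such a template tag might emit, here is a minimal Python sketch. It is not the actual edx-platform Mako tag: the function name, the SFE_context element id, the static path, and the per-entry-point file names are all assumptions made for this example.

```python
import json
from html import escape

# Hypothetical id shared between the page and the studio-frontend components.
CONTEXT_ELEMENT_ID = "SFE_context"


def render_studio_frontend(entry_point, context, static_base="/static/studio-frontend"):
    """Build the HTML that embeds one studio-frontend entry point (sketch only)."""
    # Escape "</" so a string inside the context cannot close the script tag early.
    # "\/" is a valid JSON escape, so the payload still parses to the same data.
    context_json = json.dumps(context, sort_keys=True).replace("</", "<\\/")
    name = escape(entry_point, quote=True)
    lines = [
        # Initial data from the Django app, readable by components via the shared id.
        f'<script type="application/json" id="{CONTEXT_ELEMENT_ID}">{context_json}</script>',
        # Assumed naming scheme: one CSS and one JS bundle per Webpack entry point.
        f'<link rel="stylesheet" href="{static_base}/{name}.min.css">',
        f'<script src="{static_base}/{name}.min.js"></script>',
    ]
    return "\n".join(lines)
```

A call like render_studio_frontend("assets", {"course_id": "..."}) would return three tags: the JSON context, the stylesheet, and the script for that entry point.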
For performance, modules that are shared across all studio-frontend components are extracted into common.min.js and common.min.css files that are included on every Studio template that has a studio-frontend component. Users’ browsers should cache these files so that they do not have to re-download libraries like React and Redux every time they visit a new page that contains a studio-frontend component.
Since the move to Bootstrap had not yet reached the Studio part of the edx-platform, most of the existing Studio styling clashed with the Bootstrap CSS that studio-frontend components introduced. The Bootstrap styles were also leaking outside of the studio-frontend embedded component div and affecting the rest of the Studio page around it.
We were able to prevent styles leaking outside of the studio-frontend component by scoping all CSS to only the div that wraps the component. Thanks to the Webpack postcss-loader and the postcss-prepend-selector plugin, we were able to automatically scope all of our CSS selectors to that div in our build process.
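To make the idea of selector scoping concrete, here is a deliberately naive Python sketch of the transformation. It is not how postcss-prepend-selector is implemented (that plugin works on a parsed CSS tree); this version only handles flat rules with no at-rules or nesting, and the .SFE-wrapper class name used in the usage note is a hypothetical stand-in for the wrapper div’s selector.

```python
import re

# Matches one flat rule: a selector list followed by a declaration block.
RULE_PATTERN = re.compile(r"([^{}]+)\{([^{}]*)\}")


def scope_css(css, wrapper_selector):
    """Prefix every selector in flat CSS with wrapper_selector (naive sketch)."""

    def scope_rule(match):
        # A rule can list several selectors separated by commas; scope each one.
        selectors = [s.strip() for s in match.group(1).split(",")]
        scoped = ", ".join(f"{wrapper_selector} {s}" for s in selectors if s)
        declarations = match.group(2).strip()
        return f"{scoped} {{ {declarations} }}"

    return "\n".join(scope_rule(m) for m in RULE_PATTERN.finditer(css))
```

For example, scope_css(".btn, a:hover { color: red; }", ".SFE-wrapper") gives ".SFE-wrapper .btn, .SFE-wrapper a:hover { color: red; }", so the rule can only match elements inside the wrapper.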
Preventing the Studio styles from affecting our studio-frontend component was a much harder problem because it means avoiding the inherently cascading nature of CSS. A common solution to this issue is to place the 3rd-party component inside of an iframe element, which essentially creates a completely separate sub-page where both CSS and JavaScript are isolated from the containing page. Because iframes introduce many other performance and styling issues, we wanted to find a different solution to isolating CSS.
The CSS style all: initial allows resetting all properties on an element to their initial values as defined in the CSS spec. Placing this style under a wildcard selector in studio-frontend allowed us to reset all inherited properties from the legacy Studio styles without having to enumerate them all by hand.
* {
  all: initial;
}
While this CSS property doesn’t have broad browser support yet, we were able to polyfill it thanks to postcss with the postcss-initial plugin.
However, this resets the styles to nothing. For example, all divs are displayed inline. To return the styles to some sane browser default, we had to re-apply a browser default stylesheet. You can read more about this technique at default-stylesheet.
From there, Bootstrap’s reboot normalizes the browser-specific styling to a common baseline, and the Bootstrap styles are then applied without conflicts from the surrounding CSS cascade.
There’s a candidate recommendation in CSS for a contain property, which will “allow strong, predictable isolation of a subtree from the rest of the page”. I hope that it will provide a much more elegant solution to this problem once browsers support it.
Another major challenge with separating out the frontend from edx-platform was that most of our internationalization tooling was instrumented inside the edx-platform. So, in order to display text in studio-frontend components in the correct language, we either had to pass already-translated strings from the edx-platform into studio-frontend, or set up translations inside studio-frontend.
We opted for the latter because it kept the content close to the code that used it. Every display string in a component is stored in a displayMessages.jsx file and then imported and referenced by an id within the component. A periodic job extracts these strings from the project, pushes them up to our translations service Transifex, and pulls any new translations to store them in our NPM package.
Because Transifex’s KEYVALUEJSON file format does not allow for including comments in the strings for translation, Eric created a library called reactifex that will send the comments in separate API calls.
Studio includes the user’s language in the context that it sends to a studio-frontend component for initialization. Using this, the component can display the message for that language if it exists. If it does not, then it will display the original message in English and wrap it in a span with lang="en" as an attribute so that screen-readers know to read it in English even if their default is some other language.
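The fallback rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not studio-frontend’s React code: the function name, the dictionary shapes, and the message ids are assumptions.

```python
from html import escape


def render_message(message_id, default_messages, translations, locale):
    """Return markup for one display string, falling back to English (sketch)."""
    translated = translations.get(locale, {}).get(message_id)
    if translated is not None:
        # A translation exists for the user's language, so show it as-is.
        return escape(translated)
    # No translation: show the English original and mark it with lang="en"
    # so that screen readers pronounce it as English.
    return f'<span lang="en">{escape(default_messages[message_id])}</span>'
```

With default_messages = {"uploadButton": "Upload"} and translations = {"es": {"uploadButton": "Subir"}}, the "es" locale renders Subir, while a locale with no translation renders the English text wrapped in the lang="en" span.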
Read more about studio-frontend’s internationalization process in the documentation that Eric wrote.
To normalize the development environment across the whole studio-frontend team, development is done in a Docker container. This is a minimal Ubuntu 16.04 container with a specific version of Node 8 installed, and its only purpose is to run Webpack. This follows the pattern established in OEP-5: Pre-built Development Environments for running a single Docker container per process that developers can easily start without installing dependencies.
Similar to edX’s devstack, there is a Makefile with commands to start and stop the Docker container. The Docker container then immediately runs npm run start, which runs Webpack with the webpack-dev-server. The webpack-dev-server is a Node server that serves assets built by Webpack. Studio-frontend’s Webpack config makes this server available to the developer’s host machine at http://localhost:18011.
With hot-reload enabled, developers can now visit that URL in their browser, edit source files in studio-frontend, and then see changes reflected instantly in their browser once Webpack finishes its incremental rebuild.
However, many studio-frontend components need to be able to talk to the edx-platform Studio backend Django server. Using Docker’s network connect feature, the studio-frontend container can join the developer’s existing Docker devstack network so that the studio-frontend container can make requests to the Docker devstack Studio container at http://edx.devstack.studio:18010/ and Studio can access studio-frontend at http://dahlia.studio-fronend:18011/. The webpack-dev-server can now proxy all requests to Studio API endpoints (like http://localhost:18011/assets) to http://edx.devstack.studio:18010/.
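The routing decision behind that proxy can be sketched in Python as follows. This is not the webpack-dev-server configuration itself, only the rule it applies: the function name is made up, and the list of proxied prefixes is an assumption that contains just the /assets example mentioned above.

```python
STUDIO_BACKEND = "http://edx.devstack.studio:18010"

# Assumption: only the /assets endpoint from the text is listed here.
PROXIED_PREFIXES = ("/assets",)


def proxy_target(request_path):
    """Return the Studio URL a dev-server request is forwarded to, or None."""
    # Compare on the path only, ignoring any query string.
    path_only = request_path.split("?", 1)[0]
    for prefix in PROXIED_PREFIXES:
        if path_only == prefix or path_only.startswith(prefix + "/"):
            return STUDIO_BACKEND + request_path
    # Anything else (bundles, hot-reload traffic) is served by the dev server.
    return None
```

A request for /assets on localhost:18011 is therefore forwarded to the same path on the Studio container, while a request for a Webpack bundle is answered locally.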
Since studio-frontend components will be embedded inside of an existing Studio page shell, it’s often useful to develop studio-frontend components inside of this set-up. This can be done by setting a variable in the devstack’s cms/envs/private.py:
STUDIO_FRONTEND_CONTAINER_URL = 'http://localhost:18011'
This setting is checked in the Studio Mako templates wherever studio-frontend components are embedded. If it is set to a value other than None, then the templates will request assets from that URL instead of Studio’s own static assets directory. When a developer loads a Studio page with an embedded studio-frontend component, their studio-frontend webpack-dev-server will be requested at that URL. Similarly to developing on studio-frontend in isolation, edits to source files will trigger a Webpack compilation and the Studio page will be hot-reloaded or reloaded to reflect the changes automatically.
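The check the templates perform could look something like this minimal Python sketch. The function name and the static directory layout are assumptions for illustration; only the STUDIO_FRONTEND_CONTAINER_URL setting and its None default come from the setup described above.

```python
def studio_frontend_asset_url(filename, container_url=None, static_url="/static/"):
    """Pick where a studio-frontend asset is loaded from (sketch only).

    container_url stands in for settings.STUDIO_FRONTEND_CONTAINER_URL.
    """
    if container_url is not None:
        # Development: load straight from the developer's webpack-dev-server.
        return f"{container_url.rstrip('/')}/{filename}"
    # Production: load the copy in Studio's static directory.
    # The "studio-frontend/dist" sub-path is a hypothetical layout.
    return f"{static_url.rstrip('/')}/studio-frontend/dist/{filename}"
```

With container_url set to 'http://localhost:18011', common.min.js resolves to http://localhost:18011/common.min.js; with None, it resolves to a path under the static directory.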
Since the studio-frontend JS loaded on localhost:18010 is now requesting the webpack-dev-server on localhost:18011, an Access-Control-Allow-Origin header has to be configured on the webpack-dev-server to get around CORS violations.
Each release of studio-frontend will upload the /dist files built by Webpack in production mode to NPM. edx-platform requires a particular version of studio-frontend in its package.json. When a new release of edx-platform is made, paver update_assets will run, which will copy all of the files in node_modules/@edx/studio-frontend/dist/ to the Studio static folder. Because STUDIO_FRONTEND_CONTAINER_URL will be None in production, it will be ignored, and Studio pages will request studio-frontend assets from that static folder.
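The copy step itself amounts to very little code. The Python sketch below shows the general shape under stated assumptions: it is not the implementation of paver update_assets, and the function name and the studio-frontend destination sub-directory are hypothetical.

```python
import shutil
from pathlib import Path


def copy_studio_frontend_dist(node_modules, static_root):
    """Copy the published production bundles into a static directory (sketch)."""
    source = Path(node_modules) / "@edx" / "studio-frontend" / "dist"
    # Hypothetical destination sub-directory inside the static folder.
    destination = Path(static_root) / "studio-frontend"
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file():
            # copy2 preserves timestamps along with the file contents.
            shutil.copy2(path, destination / path.name)
            copied.append(path.name)
    return copied
```

The function returns the names of the files it copied, which makes it easy to log or assert on what ended up in the static folder.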
Instead of “bringing the new into the old”, we’d eventually like to move to a model where we “work in the new and bring in the old if necessary”. We could host studio-frontend statically on a completely separate server which talks to Studio via a REST (or GraphQL) API. This approach would eliminate the complexity around CSS isolation and bring big performance wins for our users, but it would require us to rewrite more of Studio.