Automatic Generation

of City Layouts





Jim Susinno October 31, 2003

Advisor: Subodh Kumar


This qualifies as a Masters Project


































Jim Susinno 2003

Automatic Generation of City Layouts




This paper presents a randomized approach to the automatic modeling of large urban environments.  Building layout zones are derived in 2D using split grammars and are populated with rows of structures, selected from a list of polygon meshes.  Zoning constraints may be placed within the split grammar, controlling the model selected for zone population and how densely instances of it are packed.  Roads and other flat features may also be generated by the split grammar in a programmable fashion.  City components are raised according to a given heightmap and can be displayed within the system at non-interactive rates or exported to a more sophisticated rendering engine or visualization system.





This paper addresses the problem of automatic city modeling as a tool for the creation of fictitious cities.  The work is motivated by the desire to test a large-scale dataset visualization system[vLOD] with data that resembles real-world urban geometry.  The ultimate goal is to be able to quickly generate a randomized city layout with as little input from the user as possible.


The principal tool used in realizing this automatic generation goal is the split grammar.  The grammar is an elegant tool allowing for varying degrees of control over generated output.  Cities may be deterministically generated according to fixed rules, or very loosely specified and allowed to grow in a largely random fashion.  Particular rules can be weighted above other rules, encouraging their appearance in a generated city to a specified degree.  Individual grammar symbols can be constructed to include themselves in their own productions, resulting in potentially infinite subdivision of zones, though this approach often does not produce realistic cities.  A relatively small set of appropriate zoning rules can be employed to produce an unlimited number of unique cities with similar layout heuristics.  City split grammars are specified in ascii text format and can be edited by hand and passed to the system at runtime.




Related Work

The vLOD system uses visibility computation and Level-of-detail information to render large out-of-core datasets at interactive rates by only fetching the necessary geometry to draw the scene to within a specified amount of screen-space error.  The dataset’s model space is divided into a regular 3D grid of regions each with an associated collection of connected components that are in view from that particular region, and each connected component’s appropriate level of detail.  It is the hope of the designers of the system that vLOD will be capable of displaying large city datasets, such as the ones produced here, at interactive rates.


In “Instant Architecture”[1], Wonka et al. introduce the notion of split grammars and employ them to generate realistic building fronts.  Starting with a generic rectangular building face, a specialized set of architectural split rules is used to break down a building’s face into components and potentially further into subcomponents.  Components may exhibit coherence along groups by producing models similar in appearance, such as a row of identical windows, or by applying larger geometric features at higher levels of subdivision, such as a decorative band or cornice.  The system is not limited to rectangular faces, and is capable of producing widely varying types of architecture, including cylindrical shapes and non-trivial floorplans and connectivity arrangements.



In “Procedural Modeling of Cities”[2], Parish and Muller describe a method of road generation as the first step in the approach to building city layouts.  Their system takes added input in the form of terrain and population maps and uses this information to lay roads optimally according to defined rules in an Open L-System modified to form closed loops.  Roads may be generated according to any of several rules, including radial layout, grid layout, and elevation-dependent layout.  Once roads are generated, lots of space between roads are subdivided by a method similar to the approach given in this paper and passed to the building generating L-System.  Structures are then generated and textured according to geometric rules in the L-System and placed within the lots.


In “Real-time Procedural Generation of ‘Pseudo Infinite’ Cities”[4], Greuter et al. use a randomized approach to generate content at runtime.  Structures are generated on-the-fly to fill the area in the viewing frustum and are deleted from memory once they fall out of view.  Upon returning to the viewing frustum, structures are re-generated, and since the seed for this generation is derived from the structure’s position, the same structure is generated every time, guaranteeing geometric consistency of the model.  This approach is unique in its sparing usage of memory, but makes no provisions for the generation of roads in an orderly fashion resembling that of today’s cities.  Generated floor plans are always densely packed, producing cities of a very high density.


The methods presented in “Scene Assembly for Large Scale Urban Reconstructions”[3] apply to the slightly different goal of historical city modeling.  In such a system, the desired output is an exact replica of a specific city, or one as exact as possible given the limited knowledge of the city’s appearance one may have.  The first step in reconstructing a city is providing a heightmap via one of several existing terrain formats.  Once provided with a heightmap, the Scene Assembler places roads on the map by connecting nodes with segments.  Segments may be linear or Bezier curves, and each node functions as a junction point at which multiple roads can connect.  When segments connect to each other to form polygons(possibly concave but NOT self-intersecting), regions are formed in the area enclosed by the segments.  Objects may be placed in the scene manually or automatically.  Automatic placement options are along roads or within regions.  Houses and structures are normally placed along roads while trees and other natural objects are placed within regions.


Laycock and Day present methods for building footprint and roof generation in “Automatically Generating Large Urban Environments based on the Footprint Data of Buildings”[5].  Footprints are first aggregated using sets of line segments, and are then extruded vertically to form walls.  These rectilinear figures are then given roofs by one of three approaches which resemble the roof of a small house.  This style of architecture seems best suited for a suburban residential area where angled roofs are prevalent, and less applicable in an inner city environment where apartment and office buildings preside with flat roofs or spires.



The city generation system is currently implemented to take geometric input in the form of user-defined polygon, although these points may come from any source.  Only non-overlapping, potentially non-convex polygons are accepted as sane input.  Within the current system, points are given to the input program as mouse clicks in the input window sequentially connected by lines.  Once the input points are all given, the input polygon is triangulated using Triangulation Code by Atul Narkhede and Dinesh Manocha[8].  Once the input polygon is triangulated, all subsequent triangular faces are transferred into a winged-edge list.  The purpose of the winged-edge list is to merge any pair of triangular faces that closely resembles a rectangle in order to encourage the existence of quadrilaterals, which are easily and elegantly subdivided.  Each face in the winged-edge list(consisting of triangles and quads) becomes the root node of a layout tree, which is grown according to a split grammar.  Every node in the layout tree corresponds to a specific zone(area) in the xz-plane. 


A zone is an area over the xz plane in the map, denoting a particular type of structure with which that area is to be populated.  A typical zone is illustrated above in figure 1.  Zones are represented as lists of floating point pairs and may contain 3 or 4 of these pairs(triangle and quad zones).  Each zone also has a type, represented as a string, which is used to identify applicable grammar rules and to select an appropriate structure to fill its area.


Each leaf node in a layout tree is grown by checking the specified grammar for rules applicable to layout nodes of its type.  Once found, the appropriate grammar rule generates a collection of nodes as the children of the current node, and the next generation step is applied to each of the children in turn until no more applicable rules are found.  A layout node for which there exist no applicable split rules in the grammar is known as a terminal node.  For each terminal node in the layout tree, the zone which it represents is filled with instances of some particular model chosen based on the type of the zone.  Different types of zones include Commercial, Residential, Industrial, Park and Road.  Parks and Roads are designated as empty zones not to be populated with structures.  Other zone types may be modified by the suffixes _Dense or _Sparse to control the spacing between placed instances of the selected model.







Procedure: City Layout Generation

Step 1: Plot vertices of a non-self-intersecting polygon in 2D space.

Step 2: Triangulate the input polygon.

Step 3: Populate a winged edge list with the results of triangulation

Step 4: Evaluate all edges for a merge condition (if angle b/w wings ~= 90degrees).

Step 5: Create a zone of type “Town” from each face in winged edge list.

Step 6: Apply grammar rules until all leaf nodes are terminal.

Step 7: Populate zones with structures.






The Town split rule(figure 2):

1.0 Town 4 Inner<e;f;g;h>  Road_1<a;b;f;e> Road_2<b;c;g;f> Road_3<c;d;h;g> Road_4<d;a;e;h>


An example split-in-half rule(figure 2):

1.0   Inner 4  Zone_1<k;p;q;j>  Zone_2<o;l;m;n> Road<l;o;p;k>






Split Grammars

The split grammar is represented by a series of rules.  Each rule has a left symbol and a list of one or more right symbols.  Rules also contain weights which are factored in the rule selection process; rules with higher weights are chosen more often than those with lower weights.  The right symbols are defined by sets of floating point pairs indicating points within the parametric space of the current left symbol.  Rules producing triangular child nodes contain 6 parameters, rules producing quads contain 8.  Rules applying to figures with 5 sides or more are possible, but dismissed in favor of the simpler 3 and 4 sided rules illustrated in figures 2 and 3.

A rule is denoted in a grammar file by a list of whitespace-delimited tokens: first the weight, left symbol name and number of parameters, then a list of child nodes.  Each child node is a comma-delimited set of tokens: first the produced node’s name, then a list of floating point doubles indicating the node’s shape in the parameter space of its parent node.  Parameters are converted to world space at production time using bilinear interpolation.  Many of the basic shape rules are quite simply expressed in triangular or quadrilateral parameter space.  This method of expression retains its effectiveness when used on elongated or irregular zones, and is capable of handling zones of arbitrary rotation.  It also allows for a carefully written grammar to ensure that all interior space of every layout node in a city will be filled by enforcing this criterion in each individual rule.



However, the output of a split grammar complicates the question of adjacency of nodes.  This question could be greatly simplified by the integration of the winged edge list structure into the production process, at the cost of potentially large amount of memory as the numbers of edges, faces and vertices increase with each subdivision.




Principal split rule

The principal splitting rule(see figure 2) surrounds a region with roads.  The use of this particular rule once at the top level of layout generation ensures complete road coverage of all generated children of the inner node.  Secondary splitting rules establish the main geometric properties of the city by defining how different types of nodes may be split.  There is one set of basic splitting rules for triangles and one for quads.  Secondary rules contain within them a specialized list of zones, which contain the string “Road” in their name.  These roads should always be written long edge first in the grammar file, and it is worth remarking about the small sections where roads intersect.  These are denoted as symbols of type ”Road_Intersection” within the grammar and do not undergo further subdivision.


These rules also contain zoning information, and may be replicated within the grammar to allow for variation in zoning of generated cities.  For instance, the same quadrant split rule may be listed more than twice in the grammar, once denoting a split into 2 residential and 2 commercial zones, once including an industrial zone, and perhaps many more times.  The use of specific symbols for zone types containing depth information is employed to exercise more control over the development of cities, but causes the undesired feature of added redundancy within the grammar file.  A slightly reduced selection of the original shape rules has been replicated 3 times for 3 levels of depth within one example grammar in order to insure that the generation process terminates at a depth of 3, as in figure 4.


Density Rules

Each node at a level 3 subdivision is designated as Residential Commercial or Industrial.  The further application of another set of geometrically trivial grammar rules randomizes the density of population of the produced node and also has the capability of generating empty “Park” zones from residentially designated areas.  These rules are illustrated in figure 5.  Each of these zones occupies the full [(0,0)(1.1)] parametric domain so the singular child node has the same shape as the parent.




Road Rules

Each road begins as a long polygon and is subdivided along its length according to 3 or 4 specified road rules, depicted in figure 6.  It is important in the construction of a city grammar that the first edge listed of each road is one of the long edges.  This allows for convenient subdivision of roads along length, and since height is evaluated per-vertex, allows roads to follow the heightmap more closely.  In practice, 2 or 3 levels of 4-way subdivision yields an accurate roadway that still consists of a reasonable number of polygons on a typical input.



  Adding extra levels of subdivision beyond this point is prohibitive in the explosion of the number of road polygons required to draw the scene.  A final rule splits the road along its length into 2 lanes and a yellow center stripe for an easily applied aesthetic effect.


Populating Zones

When a zone reaches terminal status, it is populated with objects.  The system chooses a set of appropriate models for each node and iterates over the xz plane, placing a random object from the chosen set at each location where it is deemed the object will fit entirely within the zone.  The axes of iteration are aligned with the principal edge of the zone so buildings line up along roads.  Models are translated to their given location along the xz plane, raised to the value of the heightmap function at that point, and rotated to present their front face to the edge along which they are aligned.  A buffer zone of space is modified by the presence or absence of the strings “Dense” or “Sparse” and defaults to 2 or 3 times the width of the model’s bounding box.






One sample split grammar implemented by hand in conjunction with a pre-fabricated heightmap has been used to create scores of unique, realistic city models.  The sample grammar consists of 13 distinct shape splitting rules, which are replicated for use in macroscopic and microscopic contexts.  Of these 13 basic rules, 6 apply to quadrilaterals and 3 apply to triangles


A typical city will generate placements for 1000-4000 objects and 100-200 stretches of road.  The total polygon count of generated models ranges from approximately 6-10 million.  Objects are grouped in such a way as not to appear too regular, but to exhibit some chaotic aspects of layout amidst their order.


Generated cities can be rendered using bounding boxes in place of models at interactive rates on a Pentium 4 1.6GHz with Geforce4Go.  When replaced by the full polygonal meshes ranging from 30-20000 polys each, rendering time increases to 5-10 seconds per frame.  It is believed that the same data can be rendered interactively by the vLOD system, which takes visibility and level of detail into account.








One of the shortcomings of this system is the layout of roads within the model.  Since the question of node adjacency is complicated by the tree structure of the layout, 2 roads will often coexist right next to each other on the borders of 2 neighboring zones.

One shortcoming of the zone model was its inconsistency with the heightmap.  In a previous version of the system, individual zones were rendered in space and constricted to planar shapes.  This restriction caused some corners of zones to have heights drastically different from the heightmap, causing holes in the model.  In addition, the linear sides of zones did not follow the irregular heightmap, and could not be subdivided without adding significant complexity to the shape grammar.


Another difficulty was raising the roads to the level of the underlying heightmap.  Due to the subdivision scheme employed by the layout generator and the planar nature of triangles, each road coincides with the heightmap only at each of its vertices, and hence only approximates the shape of the heightmap.  This means that pieces of the road may be submerged underneath the land of the heightmap, a visual artifact that is not consistent with reality.  Attempting to remedy this problem by raising the roads slightly does not adequately address the problem; submerging can still occur and the realism of the model at a small scale is affected once the height offset approaches a value close to the height of a house or the width of a road.

This problem could be solved by rendering the roads to a texture map to be applied to the heightmap, rather than as individual polygons.  This solution is consistent with physical intuition and models roads as stripes of paint on a solid surface.




Conclusions and Future Work


One possible extension to the model population system would be to vary the selection of models used to populate a given area, instead of just using rows and columns of the same model.  A useful means of doing so might include a collection of models classified as belonging to one or more zone types from which the population algorithm is able to browse and select at random.


The model could be extended to create small side streets between the rows and columns of a particular zone, and road signs and parking lots.


The grammar implementation could be extended to have one monolithic set of shape rules, which can be re-applied to different zoning designations by referring to the desired split rule within a zoning rule.  This extension would significantly enhance the simplicity, elegance and readability of the grammar files used by the system to generate cities.


Another possible extension of this work would be to keep all shape subdivision data in a winged edge list.  This would require the writing of functions to update adjacency information after a split grammar rule application, and would greatly simplify the question of adjacency, allowing neighboring nodes to interact.  This extension may allow for 2 neighboring Town nodes to place a large highway in between their common edges rather than 2 adjacent roads.












[1] P. Wonka, M. Wimmer, F. Sillion, W. Ribarsky.  Instant Architecture.  Siggraph July 2003 Vol 22 Issue 3.


[2] Y. Parish, P. Muller.  Procedural Modeling of Cities.  August 2001 Proceedings of 28th Annual Conference on Computer Graphics and Interactive Techniques.


[3] P.A.Flack, J.Wilmott, S.P.Browne, D.B.Arnold, A.M.Day.  Scene Assembly for Large Scale Urban Reconstructions.  November 2001 Proceedings of 2001 Conference on Virtual Reality, Archeology and Cultural Heritage.


[4] S. Greuter, J. Parker, N. Stewart, G. Leach.  Real-time Prodecural Generation of ‘Pseudo-Infinite’ Cities.  February 2003 Proceedings of First International Conference on Computer Graphics and Interactive Technologies in Australasia and Southeast Asia.


[5] R.G.Laycock, A.M.Day.  Automatically Generating Large Urban Environments based on the Footprint Data of Buildings.  June 2003 Proceedings of 8th ACM Symposium on Solid Modeling and Applciations.


[6] J.M.Haile. A Flexible Parsing Engine For Lindenmayer Systems.  Student Poster presentation, Siggraph 2002.


[7] P. Lorenz, T Schulze.  Layout Based Model Generation.  Proceedings of the 1995 Winter Simulation Conference.


[8] A. Narkhede, D. Manocha.  Plyogon Triangulation code.